
[perl #117355] [lu]cfirst don't respect 'use bytes'

From: Victor Efimov via RT
Date: August 12, 2013 21:20
Subject: [perl #117355] [lu]cfirst don't respect 'use bytes'
Message ID: rt-3.6.HEAD-2552-1376342443-837.117355-15-0@perl.org

Sorry, RT corrupted this character in the code (even RT doesn't like Latin-1
chars, unlike "wide" chars such as Cyrillic!). I meant this one:
http://www.fileformat.info/info/unicode/char/da/index.htm

On Mon Aug 12 14:17:19 2013, vsespb wrote:
> On Mon Aug 12 13:57:25 2013, ikegami@adaelis.com wrote:
> > On Mon, Aug 12, 2013 at 4:08 PM, Victor Efimov via RT <
> > perlbug-followup@perl.org> wrote:
> > >
> > > sub try_drop_utf8_flag
> > > {
> > >   Encode::_utf8_off($_[0]) if utf8::is_utf8($_[0]) &&
> > > (bytes::length($_[0]) == length($_[0]));
> > > }
> > 
> > 
> > That's just C<< utf8::downgrade($_[0], 1) >>
> 
> Yes, you are right, except for one small difference.
> For characters > 127 but <= 255 it works differently.
> Thus it cannot be used when the strings are filenames (as in the example
> above, and in another example below).
> 
> (That's, by the way, exactly how I use it in my program
> https://github.com/vsespb/mt-aws-glacier: read millions of filenames,
> split, try to drop UTF-8 flags, and process with regexps.)
> 
> use bytes ();
> use utf8;
> binmode STDOUT, ":encoding(utf-8)";
> use Devel::Peek;
> sub try_drop_utf8_flag
> {
>   Encode::_utf8_off($_[0]) if utf8::is_utf8($_[0]) &&
> (bytes::length($_[0]) == length($_[0]));
> }
> sub do_downgrade
> {
>   utf8::downgrade($_[0], 1)
> }
> my $s = "�";
> my $s1 = $s;
> try_drop_utf8_flag($s1);
> my $s2 = $s;
> do_downgrade($s2);
> Dump($s1);
> Dump($s2);
> 
> 
> die unless $s1 eq $s2;
> 
> open my $f, ">", "$s1.tmp";
> binmode $f;
> syswrite $f, "TEST";
> close $f;
> 
> open $f, "<", "$s2.tmp" or die "file not found $!";
> 
> 
> __END__
> SV = PVMG(0xfc00a0) at 0xfc1440
>   REFCNT = 1
>   FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
>   IV = 0
>   NV = 0
>   PV = 0x1042b90 "\303\272"\0 [UTF8 "\x{fa}"]
>   CUR = 2
>   LEN = 8
>   MAGIC = 0x1094090
>     MG_VIRTUAL = &PL_vtbl_utf8
>     MG_TYPE = PERL_MAGIC_utf8(w)
>     MG_LEN = 1
> SV = PV(0xfd6538) at 0xfc1488
>   REFCNT = 1
>   FLAGS = (PADMY,POK,pPOK)
>   PV = 0xfdccd0 "\372"\0
>   CUR = 1
>   LEN = 8
> file not found No such file or directory at bench3-poc.pl line 29.
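
The difference described in the quoted message can be reproduced without touching the filesystem. The following is a minimal sketch, not the original test script: it writes the character as the escape "\x{DA}" (the code point linked above; any code point from 128 to 255 should behave the same way), forces the UTF-8 internal representation with utf8::upgrade, and then compares the two approaches. The helper internal_bytes is a hypothetical name introduced only for this illustration.

```perl
use strict;
use warnings;
use Encode ();
use bytes ();

# Same logic as the try_drop_utf8_flag sub quoted above.
sub try_drop_utf8_flag {
    Encode::_utf8_off($_[0])
        if utf8::is_utf8($_[0]) && bytes::length($_[0]) == length($_[0]);
}

# Hypothetical helper: hex dump of a string's internal buffer.
sub internal_bytes {
    my ($copy) = @_;             # work on a copy, leave the caller's string alone
    Encode::_utf8_off($copy);    # expose the raw buffer as bytes
    return unpack 'H*', $copy;
}

my $s = "\x{DA}";                # one character in the 128..255 range
utf8::upgrade($s);               # force the UTF-8 internal representation

my $kept = $s;
try_drop_utf8_flag($kept);       # 2 bytes != 1 character, so nothing changes

my $down = $s;
utf8::downgrade($down, 1);       # re-encodes the buffer as one Latin-1 byte

for my $case ([ 'try_drop_utf8_flag', $kept ], [ 'utf8::downgrade', $down ]) {
    my ($name, $str) = @$case;
    printf "%-20s flag=%d bytes=%s\n",
        $name, (utf8::is_utf8($str) ? 1 : 0), internal_bytes($str);
}
printf "eq: %s\n", ($kept eq $down ? 'yes' : 'no');
```

The two strings compare equal as characters, but the first keeps the two-byte buffer c39a with the UTF8 flag on, while the second ends up as the single byte da with the flag off. That matches the Dump output quoted above, and it is presumably why the two open calls in the quoted script end up referring to different file names.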




---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=117355
