
Re: [perl #80030] Matching upper ASCII characters from file in RE patterns

From: SADAHIRO Tomoyuki
Date: December 12, 2010 06:14
Subject: Re: [perl #80030] Matching upper ASCII characters from file in RE patterns
Message ID: 20101212231312.AE6F.CB027F2D@nifty.com

On Sat, 11 Dec 2010 14:43:16 +0100
demerphq <demerphq@gmail.com> wrote:
...
> Also, and much worse is that at least up until 5.10 this insane
> remapping of codepoints also affected: \N{U+$codepoint} syntax.
> 
> Fixed sometime since then as its not in blead, but i havent checked
> when or if it is fixed in 5.12.
> 
> $ perl -v && perl -le'use encoding "iso 8859-7"; $a = "\xDF";
> $b="\N{U+DF}"; printf "0x%04x\n", ord for $a,$b'
...

This fix seems to have happened between 5.11.4 and 5.11.5, in
    PATCH: [perl #56444] delayed interpolation of \N{...}
    http://perl5.git.perl.org/perl.git/commit/ff3f963aa0f95ea53996b6a3842b824504b57c79

which makes \N{U+XX} syntax always have Unicode semantics
and prevents the block (in toke.c:S_scan_const()):
    if (PL_encoding && !has_utf8) {
        sv_recode_to_utf8(sv, PL_encoding);
from recoding \xDF to \x{3AF} under "use encoding 'iso 8859-7'".
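
To make the two code points concrete, here is a minimal sketch. It does not
use the encoding pragma (deprecated in later perls); instead it assumes that
an explicit Encode::decode() call applies the same ISO-8859-7 mapping that
the recoding step above applied to "\xDF". The variable names are only
illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Byte 0xDF interpreted as ISO-8859-7 becomes U+03AF
# (GREEK SMALL LETTER IOTA WITH TONOS).
my $recoded = ord(decode('iso-8859-7', "\xDF"));

# \N{U+DF} names the code point itself, so it stays U+00DF
# (LATIN SMALL LETTER SHARP S) whatever the source encoding is.
my $named = ord("\N{U+DF}");

printf "0x%04x\n", $_ for $recoded, $named;
# prints:
# 0x03af
# 0x00df
```

With the fix, "\N{U+DF}" behaves like the second case even under
"use encoding 'iso 8859-7'", while "\xDF" is still recoded as in the first.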

Perhaps the Encode maintainer also had not considered
whether "\N{U+DF}" should be equivalent to "\xDF".

