Re: Proposed update for 5.14 for perlunicode.pod
From: Karl Williamson
Date: April 12, 2011 20:41
Subject: Re: Proposed update for 5.14 for perlunicode.pod
Message ID: 4DA51AF3.6080509@khwilliamson.com

On 04/12/2011 08:47 PM, Tom Christiansen wrote:
> I erroneously wrote:
>
>> Does that mean that Perl will do the right thing if I simply say
>
>>     use locale;
>
>> I don't think it will.
>
> I was wrong, but there is still something confusing me.
>
> This shows that use locale has a built-in setlocale:
>
>      % echo $PERL_UNICODE $LANG
>      S en_US.UTF-8
>
>      % blead -CS -Mlocale -le 'print "\u\xE9"'
>      É
>      % blead -CS -M-locale -le 'print "\u\xE9"'
>      é
>      % blead -CS -le 'print "\u\xE9"'
>      é
>      % blead -CS -lE 'print "\u\xE9"'
>      É
>

I've always believed the documentation, which says that use locale doesn't
do that.  On my machine the first command prints the lowercase é, but that
could be a problem with my Linux box.  I think you've seen the same thing:
Darwin works and Linux doesn't.  Is that right?

> But this shows that /u regexes don't work like I would
> think they would:
>
>      % blead -le 'print "\xE9" =~ s/(.)/\u$1/r'
>      é
>      % blead -Mlocale -le 'print "\xE9" =~ s/(.)/\u$1/r'
>      É
>
> But:
>
>      % blead -le 'print "\xE9" =~ s/(\w)/\u$1/lr'
>      é
>      % blead -le 'print "\xE9" =~ s/(.)/\u$1/ru'
>      é
>
> Drat.  It isn't using Unicode case mapping when you use /u.
> Is that expected?  So /u *isn't* like an automatic
> use feature unicode_strings any more so than /l is (not)
> an automatic use locale?
>
> I wonder why I keep thinking they are. :(

The legalistic answer is that a regex modifier affects only pattern
matching; it does not apply to the replacement part of a substitution.
But the truth of the matter is that I never thought about it during the
implementation.  You could file a bug report.  I don't know enough about
the areas of Perl involved to know how easy it would be to implement.
This is a case where use locale and unicode_strings work differently from
/l and /u, because they apply to more than pattern matching.  I think the
docs should be changed to mention this issue, unless we block 5.14 and
fix it.
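
The difference described above can be sketched in a short script.  This is
only an illustration, assuming a perl of 5.14 or later with no locale in
effect; the variable names are made up for the example and are not taken
from the thread.

```perl
use strict;
use warnings;

# Deliberately no "use v5.12" or later here: that would switch on
# unicode_strings for the whole file and hide the difference.

# U+00E9 (e with acute) as a native, non-UTF-8-flagged string.
my $e_acute = "\xE9";

# /u selects Unicode rules for the pattern, so \w can match \xE9.
my $w_default = ($e_acute =~ /\w/)  ? 1 : 0;
my $w_unicode = ($e_acute =~ /\w/u) ? 1 : 0;

# /u on s/// does not reach the \u in the replacement part.
my $via_modifier = $e_acute =~ s/(.)/\u$1/ru;

# unicode_strings covers case-changing operations too, \u included.
my $via_feature;
{
    use feature 'unicode_strings';
    $via_feature = $e_acute =~ s/(.)/\u$1/r;
}

printf "\\w matches without /u: %d, with /u: %d\n", $w_default, $w_unicode;
printf "after s///ru:              U+%04X\n", ord $via_modifier;
printf "after s///r under feature: U+%04X\n", ord $via_feature;
```

If the behaviour is as described in this thread, the /u run should leave
the character at U+00E9 while the unicode_strings run should give U+00C9,
even though /u does change what \w matches.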




