develooper Front page | perl.perl5.porters | Postings from April 2008

[perl #41664] unicode and case insensitive regex

From: Bram via RT
Date: April 30, 2008 07:40
Subject: [perl #41664] unicode and case insensitive regex
Message ID: rt-3.6.HEAD-5488-1209557967-1194.41664-15-0@perl.org

On Fri Mar 02 13:22:01 2007, perl-5.8.0@ton.iguana.be wrote:
> 
> Tested in 5.8.4 and 5.8.8:
> 
> perl -wle 'utf8::upgrade(my $auml = "\xe4"); print "\xe4" =~ /$auml/
>    ? "yes" : "no"'
> yes
> This is as expected
> 
> perl -wle 'utf8::upgrade(my $auml = "\xe4"); print "\xe4" =~ /$auml/i
>    ? "yes" : "no"'
> no
> Then case insensitive should match too
> 
> perl -wle 'utf8::upgrade(my $auml = "\xe4"); print $auml =~ /$auml/i
>    ? "yes" : "no"'
> yes
> Self match works
> 
> perl -wle 'use locale; utf8::upgrade(my $auml = "\xe4"); print $auml
>    =~ /$auml/i ? "yes" : "no"'
> no
> Oops, we lost it again
> 


Testing with blead:

export LC_ALL=en_US.utf8 
./perl -Mre=debug -wle '
use locale;
utf8::upgrade(my $auml = "\x{e4}");
print $auml =~ /$auml/i ? "yes" : "no"'

Compiling REx "%344"
Final program:
   1: EXACTFL <\344> (3)
   3: END (0)
stclass EXACTFL <\344> minlen 1 
Matching REx "%344" against "%344"
UTF-8 pattern and string...
Matching stclass EXACTFL <\344> against "%344" (2 chars)
Contradicts stclass... [regexec_flags]
Match failed
no
Freeing REx: "%344"


Putting anchors in the regex:
./perl -Mre=debug -wle '
use locale;
utf8::upgrade(my $auml = "\x{e4}");
print $auml =~ /^$auml$/i ? "yes" : "no"'

Matching REx "^%344$" against "%344"
UTF-8 pattern and string...
   0 <> <%344>               |  1:BOL(2)
   0 <> <%344>               |  2:EXACTFL <\344>(4)
   2 <%344> <>               |  4:EOL(5)
   2 <%344> <>               |  5:END(0)
Match successful!
yes
Freeing REx: "^%344$"
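
For side-by-side comparison, the cases from the report and the two runs above could be collected into one script. This is only a rough sketch: run_cases and the labels are made-up names, and the results of the /i cases depend on the perl version and the locale in effect, so no expected output is listed.

```perl
use strict;
use warnings;

# Sketch only: results of the /i cases vary with perl version and locale.
utf8::upgrade(my $auml = "\xe4");   # a-umlaut, UTF-8 internal encoding
my $bytes = "\xe4";                 # same character, not upgraded

# run_cases is a hypothetical helper, not part of any module.
# It returns a list of [label, "yes"|"no"] pairs.
sub run_cases {
    my @results;
    push @results, [ 'bytes    =~ /$auml/',  $bytes =~ /$auml/  ? "yes" : "no" ];
    push @results, [ 'bytes    =~ /$auml/i', $bytes =~ /$auml/i ? "yes" : "no" ];
    push @results, [ 'upgraded =~ /$auml/i', $auml  =~ /$auml/i ? "yes" : "no" ];
    {
        use locale;   # lexically scoped: only these two matches use locale rules
        push @results, [ 'locale: upgraded =~ /$auml/i',
                         $auml =~ /$auml/i ? "yes" : "no" ];
        push @results, [ 'locale: upgraded =~ /^$auml$/i',
                         $auml =~ /^$auml$/i ? "yes" : "no" ];
    }
    return @results;
}

printf "%-32s %s\n", @$_ for run_cases();
```

Running it under different LC_ALL settings, and with -Mre=debug added, should show which cases go through the stclass check and which do not.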


Normal/Expected behaviour?

Kind regards,

Bram


