develooper Front page | perl.perl5.porters | Postings from June 2010

Re: fold case matching

From: karl williamson
Date: June 3, 2010 14:25
Subject: Re: fold case matching
Message ID: 4C081DCD.8050203@khwilliamson.com

karl williamson wrote:
> Dave Mitchell wrote:
>> Just out of curiosity, which perl (if any) is doing the Right Thing
>> as regards the following code, which matches a char that case folds to
>> two chars:
>>
>>     # lc("\x{149}") => "\x{2bc}N"
>>
>>     print "ok PLAIN 1\n" if "\x{149}" =~ /\x{2bc}/i;
>>     print "ok PLAIN 2\n" if "\x{149}" =~ /N/i;
>>     print "ok PLAIN 3\n" if "\x{149}" =~ /\x{2bc}N/i;
>>
>>     print "ok ALT   1\n" if "\x{149}" =~ /\x{2bc}|ZZZZ/i;
>>     print "ok ALT   2\n" if "\x{149}" =~ /N|ZZZZ/i;
>>     print "ok ALT   3\n" if "\x{149}" =~ /\x{2bc}N|ZZZZ/i;
>>
>>
>> 5.8.0,
>> 5.13.0,
>> blead:
>>
>>     ok PLAIN 3
>>     ok ALT   1
>>     ok ALT   3
>>
>> 5.10.0,
>> 5.10.1,
>> 5.12.0:
>>
>>     ok PLAIN 1
>>     ok PLAIN 3
>>     ok ALT   1
>>     ok ALT   3
>>
>> (This is in the context of me trying to understand and fix the trie code
>> for [perl #74484] Regex causing exponential runtime+mem usage.)
>>
>>
>>
> 
> I looked at this a little bit more, enough to realize that I don't want 
> to learn this area of the code unless necessary at some later point. 
> Anyway, earlier I wrote that the two case 3 tests were the only ones
> that should match.  In blead, the problem that remains is in the trie
> code, so ALT 1 still gets matched.
> 
> The problem lies in REXEC_TRIE_READ_CHAR or its callers.  They aren't 
> calling ibcmp_utf8, and so don't get the benefit of its patch that fixed 
> the ALT 1 case in 5.12.  Now I don't know if they should be calling 
> ibcmp_utf8, but the bottom line is that they should somehow guarantee 
> that a partial character isn't matched.
> 
> Maybe it's best to wait for Yves' work on fold matching, but then I seem 
> to say a lot that it's the answer to all our problems.  After he 
> finishes it, he'll be ready to tackle world hunger, and other trifling 
> issues :)
> 
I meant to say, rather, other comparatively easy issues.
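
To make the "a partial character isn't matched" requirement concrete, here is a minimal pure-Perl sketch.  It is not how regexec.c or ibcmp_utf8 work; fold_match_whole_chars is a hypothetical helper invented for illustration, and it relies on fc(), which only exists in Perl 5.16 and later.  It folds each character of the target separately and accepts a match only when the folded pattern lines up with whole target characters.  (Note that "\x{149}" folds to "\x{2bc}n"; "\x{2bc}N" is its uppercase form, and both fold to the same string.)

```perl
use strict;
use warnings;
use feature 'fc';    # fc() requires Perl 5.16 or later

# Hypothetical helper, for illustration only: returns 1 if the literal
# string $pat matches somewhere in $str under full case folding, where
# the match must start and end on character boundaries of the ORIGINAL
# $str.  Half of a multi-character fold can therefore never match.
sub fold_match_whole_chars {
    my ($str, $pat) = @_;
    my $want = fc($pat);
    return 1 if $want eq '';                     # empty pattern matches

    # One folded chunk per source character, e.g. "\x{149}" -> "\x{2bc}n"
    my @folds = map { fc($_) } split //, $str;

    for my $start (0 .. $#folds) {
        my $got = '';
        for my $end ($start .. $#folds) {
            $got .= $folds[$end];                # consume a whole character
            return 1 if $got eq $want;
            last if length($got) >= length($want);
        }
    }
    return 0;
}

# The three PLAIN patterns from the test above, as literal strings.
my @cases = (
    [ 'PLAIN 1', "\x{2bc}"  ],
    [ 'PLAIN 2', "N"        ],
    [ 'PLAIN 3', "\x{2bc}N" ],
);
for my $case (@cases) {
    my ($label, $pat) = @$case;
    printf "%s: %s\n", $label,
        fold_match_whole_chars("\x{149}", $pat) ? 'match' : 'no match';
}
```

Under this model only PLAIN 3 reports a match, which is the behaviour argued for above: "\x{2bc}" or "N" alone would each consume only half of the fold of "\x{149}".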
