develooper Front page | perl.perl5.porters | Postings from November 2010

Re: ? RFC: Should a utf8 regex pattern with the /d modifier have unicode semantics?

From: karl williamson
Date: November 29, 2010 12:07
Subject: Re: ? RFC: Should a utf8 regex pattern with the /d modifier have unicode semantics?
Message ID: 4CF407D6.5060603@khwilliamson.com

Abigail wrote:
> On Mon, Nov 29, 2010 at 11:22:20AM -0700, karl williamson wrote:
>> Dr.Ruud wrote:
>>> On 2010-11-29 03:18, karl williamson wrote:
>>>
>>>> I think that a utf8 regex pattern with the /d modifier should have
>>>> unicode semantics. But I'm open to contrary opinions.
>>>>
>>>> I had always assumed that it should, but it doesn't currently.
>>> How to define "a utf8 regex pattern"?
>>>
>>> Will /\d$re/ be a "utf8 regex pattern" if $re has the utf8-flag on, 
>>> even if it only contains code points <= 255?
>>>
>> Yes.
> 
> 
> So, this will be fixed?
> 
>     my $e  = "\xE8";
>     my $re = "\N{WHITE SMILING FACE}";
> 
>     say $e =~ /[\w$re]/ ? "Match" : "No match";
>     say $e =~ /\w|$re/  ? "Match" : "No match";
> 
>     __END__
>     Match
>     No match
> 
> 
Well, in my local code, which I have not delivered yet, they both match.
I don't understand what the Match/No match after the __END__ is meant to
show. If you think the two patterns should return different values,
please explain.
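
For anyone who wants to check the two patterns on their own perl, here is a minimal sketch based on the quoted example. It only reports whether the character class and the alternation agree. The helper verdict() and the variables $class and $alt are names made up for this illustration; they are not from the original post. What it prints depends on the perl version, so no expected output is given.

```perl
use strict;
use warnings;
use feature 'say';
use charnames ':full';   # needed for \N{...} on perls older than 5.16

my $e  = "\xE8";                     # code point 0xE8, not above 255
my $re = "\N{WHITE SMILING FACE}";   # U+263A, so this string carries the utf8 flag

# Hypothetical helper: turn a match result into a printable verdict.
sub verdict { return $_[0] ? "Match" : "No match" }

# The same two patterns as in the quoted example.
my $class = verdict($e =~ /[\w$re]/);
my $alt   = verdict($e =~ /\w|$re/);

say "perl version:         ", $];
say "utf8 flag on subject: ", (utf8::is_utf8($e)  ? "on" : "off");
say "utf8 flag on \$re:     ", (utf8::is_utf8($re) ? "on" : "off");
say "character class:      ", $class;
say "alternation:          ", $alt;
say "the two patterns ", ($class eq $alt ? "agree" : "disagree");
```

If the two lines disagree, the perl in use treats \w differently inside and outside a bracketed class when the pattern is utf8, which is the inconsistency the quoted output points at.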
