
Re: RFC: Should a utf8 regex pattern with the /d modifier have unicode semantics?

From: Abigail
Date: November 29, 2010 10:57
Subject: Re: RFC: Should a utf8 regex pattern with the /d modifier have unicode semantics?
Message ID: 20101129185810.GF11337@almanda

On Mon, Nov 29, 2010 at 11:22:20AM -0700, karl williamson wrote:
> Dr.Ruud wrote:
>> On 2010-11-29 03:18, karl williamson wrote:
>>
>>> I think that a utf8 regex pattern with the /d modifier should have
>>> unicode semantics. But I'm open to contrary opinions.
>>>
>>> I had always assumed that it should, but it doesn't currently.
>>
>> How to define "a utf8 regex pattern"?
>>
>> Will /\d$re/ be a "utf8 regex pattern" if $re has the utf8-flag on, 
>> even if it only contains code points <= 255?
>>
>
> Yes.


So, this will be fixed?

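    use feature 'say';
    use charnames ':full';    # needed for \N{...} before perl 5.16

    my $e  = "\xE8";                      # U+00E8, stored without the utf8 flag
    my $re = "\N{WHITE SMILING FACE}";    # U+263A, stored with the utf8 flag

    # Interpolating $re makes both patterns utf8.
    say $e =~ /[\w$re]/ ? "Match" : "No match";
    say $e =~ /\w|$re/  ? "Match" : "No match";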

    __END__
    Match
    No match


In blead I get 'No match' twice, which, while consistent, surprises me.
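
The flag dependence behind this can be seen in isolation. A minimal
sketch, assuming perl 5.14 or later (where the /d modifier can be written
explicitly); the helper name matches_w_default is made up for illustration
and is not from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): does the string match \w
# under the default (/d) rules? Returns 1 or 0.
sub matches_w_default {
    my ($s) = @_;
    return $s =~ /\w/d ? 1 : 0;
}

my $native   = "\xE8";       # U+00E8 stored as a single byte, utf8 flag off
my $upgraded = "\xE8";
utf8::upgrade($upgraded);    # same code point, utf8 flag now on

print "same string: ", ($native eq $upgraded ? 1 : 0), "\n";
print "native:      ", matches_w_default($native),   "\n";
print "upgraded:    ", matches_w_default($upgraded), "\n";
```

The two strings compare equal, yet on such a perl only the upgraded one
should match \w (the lines print 1, 0 and 1), because /d switches to
Unicode rules when the target string is utf8.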


Abigail
