Re: [perl #131685] Rename utf8::is_utf8() (and other functions)

From: pali
Date: July 4, 2017 11:14
Subject: Re: [perl #131685] Rename utf8::is_utf8() (and other functions)
Message ID: 20170704111404.GA16375@pali

On Tuesday 04 July 2017 03:12:19 yves orton via RT wrote:
> On 4 July 2017 at 12:04,  <pali@cpan.org> wrote:
> > On Tuesday 04 July 2017 11:22:42 demerphq wrote:
> >> No. This is a myth. Plain and simply a myth.
> >>
> >> People have a hard time accepting it, but the utf8 flag tells parts of
> >> the internals to use different rules for certain operations, when set
> >> those rules are Unicode. When the flag is not set the default rules
> >> are derived from ASCII.
> >>
> >> You can see the difference in the following:
> >>
> >> "ba\x{DF}"=~/ss/i;
> >
> > $ perl -E 'say "matched" if "ba\x{DF}"=~/ss/i;'
> > matched
> >
> >> "ba\N{U+DF}"=~/ss/i;
> >
> > $ perl -E 'say "matched" if "ba\N{U+DF}"=~/ss/i;'
> > matched
> 
> -E is not -e.
> 
> -E is enabling a pragma which changes the default behavior.
> 
> However it is *PRAGMA*. It is NOT the normal behavior of Perl.

Ah, right. I forgot that -E enables the unicode_strings feature, which
means that both of my examples ran under Unicode rules and were
therefore equivalent.

The default behavior is a bit unpredictable, as it is affected by the
infamous Unicode Bug.

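my $str1 = "\x{DF}";     # UTF8 flag off
my $str2 = "\N{U+DF}";   # same code point, UTF8 flag on
my $str3 = "\x{100}";    # above 0xFF, UTF8 flag on

"ba$str1" =~ /ss/i;      # native rules: no match
"ba$str2" =~ /ss/i;      # Unicode rules: match

"ba$str1$str3" =~ /ss/i; # upgraded by $str3, Unicode rules: match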

To make it predictable, either the /aa or the /u modifier should always
be used. That prevents these problems:

"ba$str1" =~ /ss/aai;
"ba$str2" =~ /ss/aai;
"ba$str1$str3" =~ /ss/aai;

"ba$str1" =~ /ss/ui;
"ba$str2" =~ /ss/ui;
"ba$str1$str3" =~ /ss/ui;
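
A minimal sketch that puts the nine matches above side by side. It
assumes a plain `perl` invocation (not -E) and no `use v5.12` or later,
so that unicode_strings stays off. The @cases list and %result hash are
names made up for this illustration only.

```perl
#!/usr/bin/perl
# Run as `perl script.pl`. Do not add `use v5.12` (or later) or
# `use feature 'unicode_strings'`, or the default case changes.
use strict;
use warnings;

my $str1 = "\x{DF}";     # stored as one native byte, UTF8 flag off
my $str2 = "\N{U+DF}";   # same code point, but \N{} gives a flagged string
my $str3 = "\x{100}";    # code point above 0xFF, always flagged

# Hypothetical labels for the three target strings used in this mail.
my @cases = (
    [ 'ba$str1'      => "ba$str1" ],
    [ 'ba$str2'      => "ba$str2" ],
    [ 'ba$str1$str3' => "ba$str1$str3" ],
);

my %result;
for my $case (@cases) {
    my ($label, $s) = @$case;
    $result{$label} = {
        flag => utf8::is_utf8($s) ? 1 : 0,   # internal UTF8 flag
        d    => ($s =~ /ss/i)     ? 1 : 0,   # default rules, depend on the flag
        aa   => ($s =~ /ss/aai)   ? 1 : 0,   # ASCII never folds to non-ASCII
        u    => ($s =~ /ss/ui)    ? 1 : 0,   # Unicode rules regardless of flag
    };
    printf "%-14s utf8=%d  /i=%d  /aai=%d  /ui=%d\n",
        $label, @{ $result{$label} }{qw(flag d aa u)};
}
```

With unicode_strings off, I would expect the /i column to follow the
utf8 column (0, 1, 1), the /aai column to be 0 for all three strings
and the /ui column to be 1 for all three.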
