develooper Front page | perl.perl5.porters | Postings from December 2008

Re: ? Should \N{LATIN CAPITAL LETTER A} have the same semantics as \N{U+0041}

December 29, 2008 15:30
2008/12/29 karl williamson:
> They both mean 'A', but they have somewhat different semantics.
> Currently the LATIN CAPITAL LETTER A turns on Unicode semantics for the
> entire string or regular expression it is in, while the U+0041 does not.
> I contend that since they both mean the same Unicode code point, that they
> should have identical semantics, either both turning it on or both not.
> And I believe they both should turn it on, as the use of these constants
> implies that the program is thinking in Unicode, and so expects Unicode
> semantics.
> Is there any disagreement?

Hmm. Well, I think it's sensible on an abstract level to make U+ escapes
always enable Unicode. But the main reason that we don't is that
Unicode is slower than non-Unicode in the regex engine, and we
try to stay in non-Unicode as much as possible. Other than that point,
though, I think you are right, and I think on balance your
proposal is fine.
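
A minimal sketch for checking how a given perl treats the two spellings.
It assumes that the internal UTF-8 flag (as reported by utf8::is_utf8) is
a reasonable proxy for "Unicode semantics are on" for a string, which was
the case for the regex engine at the time of this thread. The variable
names are illustrative only, and the flag results vary by perl version, so
no expected output is shown for them.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{NAME} on older perls

# The same code point, written two ways.
my $by_name = "\N{LATIN CAPITAL LETTER A}";
my $by_code = "\N{U+0041}";

# Both denote 'A', so the strings compare equal regardless of the flag.
print "equal: ", ($by_name eq $by_code ? "yes" : "no"), "\n";

# Report whether each string is stored in Perl's internal UTF-8 form.
for my $pair ([ name => $by_name ], [ code => $by_code ]) {
    my ($label, $str) = @$pair;
    printf "%-4s utf8 flag: %s\n", $label,
        (utf8::is_utf8($str) ? "on" : "off");
}

# Why the flag matters: under the default (non-locale, no /u) rules,
# a character in the 128..255 range matches \w only once the string
# is in UTF-8 form.
my $e_acute = "\xE9";
my $before  = ($e_acute =~ /\w/) ? 1 : 0;
utf8::upgrade($e_acute);            # same characters, different storage
my $after   = ($e_acute =~ /\w/) ? 1 : 0;
print "\\xE9 matches \\w before upgrade: $before, after: $after\n";
```

The last part is the practical consequence being weighed here: anything
that flips the flag for a string changes how the other characters in that
string match, and pushes matching onto the slower Unicode path.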


perl -Mre=debug -e "/just|another|perl|hacker/"
