perl.perl5.porters | Postings from December 2008

Re: ? Should \N{LATIN CAPITAL LETTER A} have the same semantics as \N{U+0041}

From: demerphq
Date: December 29, 2008 15:30
Subject: Re: ? Should \N{LATIN CAPITAL LETTER A} have the same semantics as \N{U+0041}
Message ID: 9b18b3110812291530g28645230x8f04a90ab29f78c4@mail.gmail.com

2008/12/29 karl williamson <public@khwilliamson.com>:
> They both mean 'A', but they have somewhat different semantics.
>
> Currently \N{LATIN CAPITAL LETTER A} turns on Unicode semantics for the
> entire string or regular expression it is in, while \N{U+0041} does not.
>
> I contend that since they both mean the same Unicode code point, they
> should have identical semantics, either both turning it on or both not.
>
> And I believe they both should turn it on, as the use of these constants
> implies that the program is thinking in Unicode, and so expects Unicode
> semantics.
>
> Is there any disagreement?

Hmm. Well, I think it's sensible on an abstract level to make U+ escapes
always enable Unicode. But the main reason that we don't is that
Unicode is slower than non-Unicode in the regex engine, and that we
try to stay in non-Unicode as much as possible. Other than that point,
though, I think you are right, and I think on balance your
proposal is fine.
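
For anyone who wants to poke at the difference being discussed, here is
a minimal sketch (not from the original thread). It assumes that
utf8::is_utf8 is a reasonable proxy for "Unicode semantics are on" for a
string, and the variable names are only illustrative. What the flags
report depends on the perl version, so no output is claimed for them.

```perl
use strict;
use warnings;
use charnames ':full';   # needed on older perls for \N{NAME}

# The same code point written two ways.
my $named = "\N{LATIN CAPITAL LETTER A}";
my $coded = "\N{U+0041}";

# Both spell 'A', so they compare equal regardless of internal flags.
print "equal\n" if $named eq $coded;

# utf8::is_utf8 reports the internal UTF8 flag; results vary by perl version.
printf "named flag: %d\n", utf8::is_utf8($named) ? 1 : 0;
printf "coded flag: %d\n", utf8::is_utf8($coded) ? 1 : 0;

# Why the flag matters: a character in 128..255 can match differently
# depending on whether the string is in the upgraded (Unicode) form.
my $e = "\xE9";
my $native = ($e =~ /\w/) ? 1 : 0;
utf8::upgrade($e);
my $upgraded = ($e =~ /\w/) ? 1 : 0;
printf "\\xE9 =~ /\\w/: native=%d upgraded=%d\n", $native, $upgraded;
```

The last part is only there to show what kind of behaviour hangs on the
flag; it says nothing about the speed cost in the regex engine.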

Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


