2008/12/29 karl williamson <public@khwilliamson.com>:
> They both mean 'A', but they have somewhat different semantics.
>
> Currently the LATIN CAPITAL LETTER A turns on Unicode semantics for the
> entire string or regular expression it is in, while the U+0041 does not.
>
> I contend that since they both mean the same Unicode code point, that they
> should have identical semantics, either both turning it on or both not.
>
> And I believe they both should turn it on, as the use of these constants
> implies that the program is thinking in Unicode, and so expects Unicode
> semantics.
>
> Is there any disagreement?

Hmm. Well, I think it's sensible on an abstract level to make U+ escapes
always enable Unicode. But the main reason that we don't is that Unicode
is slower than non-Unicode in the regex engine, and that we try to stay in
non-Unicode as much as possible.

Other than that point, though, I think you are right, and I think in the
balance of things your proposal is fine.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"
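
For anyone who wants to see what their own perl does with the two spellings, here is a minimal sketch. It assumes that "Unicode semantics" in this thread corresponds to the string's internal UTF-8 flag, and that `\N{LATIN CAPITAL LETTER A}` and `\N{U+0041}` are the two constants being compared (the quoted text does not spell them out). `report()` is a hypothetical helper, not anything from the core. The output depends on the perl version, so none is shown here.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} names on older perls

# Two spellings of the same code point, U+0041 (assumed to be the
# constants under discussion).
my $by_name = "\N{LATIN CAPITAL LETTER A}";
my $by_code = "\N{U+0041}";

# report() is a hypothetical helper for this sketch. It appends a
# Latin-1 character (\xE9) to the given prefix, then reports whether
# the result carries the UTF-8 flag and whether \xE9 counts as \w.
sub report {
    my ($label, $prefix) = @_;
    my $str = $prefix . "\xE9";
    return sprintf '%-8s utf8 flag: %-3s  \xE9 matches \w: %s',
        $label,
        utf8::is_utf8($str) ? 'on'  : 'off',
        $str =~ /\w\z/      ? 'yes' : 'no';
}

print report('by name', $by_name), "\n";
print report('by code', $by_code), "\n";
```

If the two lines differ, the two spellings are being treated differently, which is the inconsistency Karl describes.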