On approximately 12/29/2008 10:54 AM, came the following characters from
the keyboard of karl williamson:
> They both mean 'A', but they have somewhat different semantics.
>
> Currently \N{LATIN CAPITAL LETTER A} turns on Unicode semantics for the
> entire string or regular expression it is in, while \N{U+0041} does not.
>
> I contend that, since they both mean the same Unicode code point, they
> should have identical semantics, either both turning it on or both
> not.
>
> And I believe they both should turn it on, as the use of these constants
> implies that the program is thinking in Unicode, and so expects Unicode
> semantics.
>
> Is there any disagreement?
Sounds correct and appropriate to me, at least under the new pragma.
The backward compatibility police might want to declare that \N{U+0041},
since it did not previously turn on Unicode semantics, still shouldn't. I
would consider its failure to turn them on a bug: a programmer who did
not want Unicode semantics should have used any of "A", 'A', "\x41",
"\x{41}", or "\101".
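
For anyone who wants to see how their own perl treats these spellings, here
is a minimal sketch. It assumes that the internal UTF8 flag, as reported by
utf8::is_utf8, is a fair proxy for "Unicode semantics turned on", which is
an assumption of this example rather than anything stated above. The helper
name utf8_flag and the @forms table are made up for illustration. The
results for the two \N{} forms depend on the perl version, so no expected
output is shown.

```perl
use strict;
use warnings;
use charnames ':full';  # lets \N{NAME} work on perls that do not autoload it

# Hypothetical helper: 1 if perl stores the string with its internal UTF8
# flag set, else 0. utf8::is_utf8 is always available; "use utf8" is not
# needed to call it.
sub utf8_flag {
    my ($string) = @_;
    return utf8::is_utf8($string) ? 1 : 0;
}

# Each entry pairs a printable label with a literal that spells 'A'.
my @forms = (
    [ '"A"'                          => "A" ],
    [ q{'A'}                         => 'A' ],
    [ '"\x41"'                       => "\x41" ],
    [ '"\x{41}"'                     => "\x{41}" ],
    [ '"\101"'                       => "\101" ],
    [ '"\N{U+0041}"'                 => "\N{U+0041}" ],
    [ '"\N{LATIN CAPITAL LETTER A}"' => "\N{LATIN CAPITAL LETTER A}" ],
);

# Report whether each form compares equal to 'A' and whether it is flagged.
for my $form (@forms) {
    my ($label, $string) = @$form;
    printf "%-30s eq 'A': %-3s  UTF8 flag: %d\n",
        $label, ($string eq 'A' ? 'yes' : 'no'), utf8_flag($string);
}
```

All seven forms should compare equal to 'A'; the interesting column is the
flag, and in particular whether the two \N{} rows agree with each other.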
--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking