Moin,

On Friday 30 March 2007 21:00:37 Marvin Humphrey wrote:
> On Mar 30, 2007, at 12:53 PM, Juerd Waalboer wrote:
> > Perl does not have strong typing.
>
> If it is so deadly to collide byte-oriented data with character data,
> it should not be so easy to do so accidentally.

It can happen every time you concatenate two strings. Maybe we could add a
new warning?

    use warnings 'upgrade';    # proposed category

    my $a = 'a';
    $a .= "\x{100}";           # warns

In an application I am currently bringing up to speed in regard to Unicode,
I opted for a "string" struct that contains essentially:

* the length in bytes
* the length in characters (not always set, e.g. it can be unknown)
* the storage buffer (containing the data, plus some optional padding)
* the encoding

Every operation between two strings thus becomes very clearly defined, as
you can compare their encodings before doing anything (for instance,
upgrading one or both strings before comparing them).

In Perl, you have only one bit to tell you the encoding (utf8), and it seems
this is not enough, as strings without that bit set can be ASCII, or
ISO-8859-1, or in the current locale (maybe?), or UTF-8 that hasn't yet been
tagged as UTF-8, etc. In short, it becomes a mess.

All the best,

Tels

-- 
 Signed on Fri Mar 30 23:11:40 2007 with key 0x93B84C15.
 View my photo gallery: http://bloodgate.com/photos
 PGP key on http://bloodgate.com/tels.asc or per email.

 "Call me Justin, Justin Case."
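
To make the "string" struct from the message concrete, here is a minimal sketch in C. It is not the code of the application mentioned above, which is not shown in the message. The names `tstring`, `encoding_t`, `CHARS_UNKNOWN`, `tstring_new`, `tstring_to_utf8` and `tstring_equal` are all invented for illustration, and so is the choice of three encodings. The point is only that, with the encoding stored next to the data, a comparison can check both tags first and upgrade explicitly when they differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding tags; the message does not say which ones were used. */
typedef enum { ENC_ASCII, ENC_LATIN1, ENC_UTF8 } encoding_t;

/* Marker for "length in characters not computed yet". */
#define CHARS_UNKNOWN ((size_t)-1)

typedef struct {
    size_t         byte_len;  /* length in bytes */
    size_t         char_len;  /* length in characters, or CHARS_UNKNOWN */
    unsigned char *buf;       /* storage buffer: data plus one padding NUL */
    encoding_t     enc;       /* encoding of the bytes in buf */
} tstring;

/* Copy len bytes into a new string tagged with enc; NULL on allocation failure. */
static tstring *tstring_new(const void *bytes, size_t len, encoding_t enc)
{
    tstring *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->buf = malloc(len + 1);
    if (s->buf == NULL) { free(s); return NULL; }
    memcpy(s->buf, bytes, len);
    s->buf[len] = '\0';
    s->byte_len = len;
    /* ASCII and Latin-1 use one byte per character; UTF-8 is left uncounted. */
    s->char_len = (enc == ENC_UTF8) ? CHARS_UNKNOWN : len;
    s->enc = enc;
    return s;
}

static void tstring_free(tstring *s)
{
    if (s != NULL) { free(s->buf); free(s); }
}

/* Return a new UTF-8 copy of s ("upgrade"); NULL on allocation failure. */
static tstring *tstring_to_utf8(const tstring *s)
{
    assert(s != NULL);
    if (s->enc != ENC_LATIN1) {
        /* ASCII bytes are already valid UTF-8; UTF-8 is copied unchanged. */
        tstring *copy = tstring_new(s->buf, s->byte_len, ENC_UTF8);
        if (copy != NULL) copy->char_len = s->char_len;
        return copy;
    }
    /* Latin-1: every byte >= 0x80 becomes two bytes, so 2x is the worst case. */
    unsigned char *tmp = malloc(2 * s->byte_len + 1);
    if (tmp == NULL) return NULL;
    size_t n = 0;
    for (size_t i = 0; i < s->byte_len; i++) {
        unsigned char c = s->buf[i];
        if (c < 0x80) {
            tmp[n++] = c;
        } else {
            tmp[n++] = (unsigned char)(0xC0 | (c >> 6));
            tmp[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    tstring *out = tstring_new(tmp, n, ENC_UTF8);
    free(tmp);
    if (out != NULL) out->char_len = s->byte_len; /* one char per Latin-1 byte */
    return out;
}

/* 1 if equal, 0 if different, -1 on allocation failure.
   Encodings are compared first; on mismatch both sides are upgraded. */
static int tstring_equal(const tstring *a, const tstring *b)
{
    assert(a != NULL && b != NULL);
    if (a->enc == b->enc)
        return a->byte_len == b->byte_len &&
               memcmp(a->buf, b->buf, a->byte_len) == 0;
    tstring *ua = tstring_to_utf8(a);
    tstring *ub = tstring_to_utf8(b);
    int result = -1;
    if (ua != NULL && ub != NULL)
        result = ua->byte_len == ub->byte_len &&
                 memcmp(ua->buf, ub->buf, ua->byte_len) == 0;
    tstring_free(ua);
    tstring_free(ub);
    return result;
}
```

With this layout, a Latin-1 string holding the four bytes `caf\xE9` and a UTF-8 string holding the five bytes `caf\xC3\xA9` compare equal, because `tstring_equal` sees the differing tags and upgrades before comparing. A single "is UTF-8" bit cannot make that decision reliably, since it does not say what the untagged bytes are. The sketch compares bytes only and does no Unicode normalization or UTF-8 validation.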