Moin,

On Saturday 31 March 2007 00:29:42 Marc Lehmann wrote:
> On Sat, Mar 31, 2007 at 02:16:49AM +0200, Juerd Waalboer <juerd@convolution.nl> wrote:
> > Marc Lehmann wrote 2007-03-31 2:12 (+0200):
> > > Yes, and the exact same is true for unicode (both have a 1-1 mapping
> > > between 0..255 and octets), trivially, of course, as unicode
> > > explicitly is a superset of latin1.
> >
> > Unicode is a character set, not a character encoding.
>
> As is latin1.
>
> > A unicode string is a sequence of codepoints, not octets.
>
> Nope. You can encode unicode codepoints into UTF-8 and still end up with
> a unicode string. Encoding doesn't change the fact that it is unicode
> that you are storing.
>
> Since it seems hard to grasp, here is an example:
>
>    my $s = "Hello, World!";
>    $s = Encode::encode_utf8 $s;
>
> $s contains the famous greeting before and after the encoding. It is
> still an ASCII string, an iso-8859-15 string, a unicode string, and a
> text string. Whether it is encoded or not does not change the fact
> that the string contains the message "Hello, World!".
>
> If you drop ASCII, the same is true for "Hallöchen!", which looks
> different in UTF-8 than in an unencoded string, but it is still the
> same message. And it is still using unicode to represent the characters.
>
> The fact that you encode something does not change the something that you
> encode. Making an arbitrary difference only confuses the issue.

Especially since Perl itself doesn't have any way to distinguish "a"
(UNKNOWN ENCODING) from "a" (ASCII) from "a" (ISO-8859-1) from "a"
(UTF-8) - except one bit :)

All the best,

Tels

-- 
Signed on Sat Mar 31 12:24:31 2007 with key 0x93B84C15.
Get one of my photo posters: http://bloodgate.com/posters
PGP key on http://bloodgate.com/tels.asc or per email.

"Most people, I think, don't even know what a rootkit is, so why
should they care about it?"

 -- Thomas Hesse, President of Sony BMG's global digital business
    division, 2005.
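
For anyone who wants to try the points from this thread, here is a minimal Perl sketch. It assumes the "one bit" refers to Perl's internal UTF8 flag as reported by utf8::is_utf8; that reading is an assumption, since the message does not name the bit. All variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# ASCII text: encoding to UTF-8 leaves the octets unchanged.
my $ascii       = "Hello, World!";
my $ascii_bytes = Encode::encode_utf8($ascii);

# Non-ASCII text: U+00F6 (o with diaeresis) becomes two octets in UTF-8,
# but decoding gives back the same message.
my $text  = "Hall\x{f6}chen!";
my $bytes = Encode::encode_utf8($text);
my $back  = Encode::decode_utf8($bytes);

# Two copies of "a" that compare equal but differ in the internal flag.
my $plain    = "a";
my $upgraded = "a";
utf8::upgrade($upgraded);    # changes the internal storage only

printf "ascii unchanged:  %s\n", $ascii eq $ascii_bytes ? "yes" : "no";
printf "chars vs octets:  %d vs %d\n", length($text), length($bytes);
printf "round trip equal: %s\n", $back eq $text ? "yes" : "no";
printf "same string:      %s\n", $plain eq $upgraded ? "yes" : "no";
printf "flags:            %d vs %d\n",
    (utf8::is_utf8($plain)    ? 1 : 0),
    (utf8::is_utf8($upgraded) ? 1 : 0);
```

The flag only describes how Perl stores the string internally. It says nothing about which encoding the octets came from, so a program still has to keep track of that itself.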