Marc Lehmann wrote on 2007-03-31 2:12 (+0200):
> Yes, and the exact same is true for unicode (both have a 1-1 mapping
> between 0..255 and octets), trivially, of course, as unicode explicitly is
> a superset of latin1.

Unicode is a character set, not a character encoding. For 8-bit character sets the two are effectively the same thing, but once you get past the 8-bit boundary, the difference begins to matter.

A Unicode string is a sequence of code points, not octets, and code points don't map 1:1 to octets either. To express a Unicode string in octets, you need to encode it. For this, there are several possibilities, including UTF-8, UTF-16, ...

Unicode is a superset of the latin1 character set, not the latin1 character encoding. We'd need bigger bytes for the latter :)
-- 
Kind regards,

juerd waalboer:  perl hacker  <juerd@juerd.nl>  <http://juerd.nl/sig>
convolution:  ict solutions and consultancy  <sales@convolution.nl>

I don't trust voting computers. See <http://www.wijvertrouwenstemcomputersniet.nl/>.
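
The distinction drawn in the message above, between a character set (code points) and a character encoding (octets), can be sketched in a few lines. This is only an illustration, written in Python rather than Perl, and the names `text`, `code_points`, and `encodable_as_latin1` are made up for the example. It assumes nothing beyond the standard `str.encode` codecs.

```python
# Code points are abstract numbers; octets only exist once an encoding is chosen.
text = "\u00e9\u20ac"  # U+00E9 (within latin1's range) and U+20AC (outside it)

# The character-set view: one integer per character, with no byte layout implied.
code_points = [ord(ch) for ch in text]

# The encoding view: the same two code points become different octet sequences.
utf8_octets = text.encode("utf-8")       # 2 + 3 octets
utf16_octets = text.encode("utf-16-be")  # 2 + 2 octets


def encodable_as_latin1(s):
    """Return True if every code point in s fits the latin1 encoding (0..255)."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print([hex(cp) for cp in code_points])
print(utf8_octets.hex(), utf16_octets.hex())
print(encodable_as_latin1("\u00e9"), encodable_as_latin1("\u20ac"))
```

The last line shows the point about the superset relation: U+00E9 is in both character sets and has a one-octet latin1 encoding, while U+20AC is a perfectly good Unicode code point that the latin1 encoding cannot represent at all.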