Glenn Linderman wrote on 2008-05-20 17:16 (-0700):
> > Instead I suggest the following two categories:
> > 1. Single byte encodings: every character is a single byte. By
> >    necessity, only a small subset of Unicode is supported.
> > 2. Multibyte encodings: every character is one or more bytes.
> >    2a. Legacy: Only a subset of Unicode is supported.
> >    2b. Unicode: The whole Unicode set is supported.
> >    2c. Full: A larger range than Unicode is supported.
> > An encoding may or may not be ASCII-compatible.
> There is only one "ASCII-compatible" encoding: ASCII itself. Other
> things are Extended ASCII, which is only somewhat compatible with
> 7-bit ASCII, not 8-bit ASCII. This is a fine point, but I think you
> can accept the term "Extended ASCII" here?

No, "extended ASCII" is a wildly confusing term that many will associate
with IBM codepages. Also, 8-bit ASCII does not exist.

Latin1 is ASCII-compatible in that every single byte that's possible in
ASCII has the same meaning in latin1. The same goes for utf8, but not for
utf16.

Although it's actually the other way around (ASCII is latin1-compatible),
I think this is a non-confusing description.
-- 
Met vriendelijke groet, Kind regards, Korajn salutojn,

Juerd Waalboer: Perl hacker <#####@juerd.nl> <http://juerd.nl/sig>
Convolution: ICT solutions and consultancy <sales@convolution.nl>
1;
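
The notion of ASCII compatibility used above (every byte that is possible in ASCII keeps its meaning) can be checked mechanically. The following is a minimal sketch in Python, not part of the original message; the helper name is_ascii_compatible is hypothetical, and "utf-16-le" is chosen as the utf16 variant only to keep a byte order mark out of the comparison.

```python
# All 128 byte values that are possible in ASCII (0x00-0x7F).
ASCII_BYTES = bytes(range(128))
ASCII_TEXT = ASCII_BYTES.decode("ascii")


def is_ascii_compatible(encoding: str) -> bool:
    """Hypothetical helper: True if every ASCII byte has the same meaning
    in `encoding`, checked in both directions (decode and encode)."""
    try:
        same_decode = ASCII_BYTES.decode(encoding) == ASCII_TEXT
        same_encode = ASCII_TEXT.encode(encoding) == ASCII_BYTES
    except UnicodeError:
        # The byte sequence is not even valid in this encoding.
        return False
    return same_decode and same_encode


if __name__ == "__main__":
    for name in ("latin-1", "utf-8", "utf-16-le"):
        print(name, is_ascii_compatible(name))
```

Under this definition latin1 and utf8 pass and utf16 does not, since utf16 spends two bytes on each ASCII character ("A" becomes the bytes 0x41 0x00 in little-endian order).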