perl.perl5.porters | Postings from November 2008

Re: char16 datatype

November 15, 2008 05:32
demerphq wrote:
> Dr.Ruud:
>> demerphq:

>>> Ultimately UTF-8 was a kludge,
>>> developed practically overnight to ensure that there would be a
>>> unicode representation that was unix legacy compatible, with the
>>> long term intention of replacing it with something better.
>> I would never put phrases like "kludge" and "intention to replace" on
>> UTF-8. Even if those were at the start (and I actually doubt they
>> ever were), that is all irrelevant now, because the kid has a life
>> of its own.
> See:
> "UCS provides the capability to encode multi-lingual text within a
> single coded character set.  However, UCS and its UTF variant do not
> protect null bytes and/or the ASCII slash ("/") making these character
> encodings incompatible with existing Unix implementations.  The
> following proposal provides a Unix compatible transformation format of
> UCS such that Unix systems can support multi-lingual text in a single
> encoding.  This transformation format encoding is intended to be used
> as a file code.  This transformation format encoding of UCS is
> intended as an intermediate step towards full UCS support.  However,
> since nearly all Unix implementations face the same obstacles in
> supporting UCS, this proposal is intended to provide a common and
> compatible encoding during this transition stage."
> Note the "transition stage" comment.

And that stage will just take forever. :)

I read no general claim that one is superior to the other in that piece
of text. The "is intended to be used as a file code" is like a first-order
approximation, and it proved to be a good one. Using UTF-8 in memory
has its drawbacks, and certainly some advantages too!
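
To make the quoted point about null bytes and "/" concrete, here is a rough
Python sketch (not taken from the proposal). It assumes Python's "utf-16-be"
codec as a stand-in for a 16-bit UCS encoding; the helper name unsafe_bytes
and the sample characters are made up for illustration.

```python
# Illustrative only: "utf-16-be" stands in for the 16-bit UCS form the
# quoted proposal talks about; the sample characters are an arbitrary choice.
NUL, SLASH = 0x00, 0x2F

def unsafe_bytes(text, encoding):
    """Return the NUL and '/' byte values present in the encoded text."""
    return sorted({b for b in text.encode(encoding) if b in (NUL, SLASH)})

samples = ["A", "\u012f", "\u2f00"]  # 'A', U+012F, U+2F00

for ch in samples:
    print(f"U+{ord(ch):04X}",
          "utf-16-be:", unsafe_bytes(ch, "utf-16-be"),
          "utf-8:", unsafe_bytes(ch, "utf-8"))
```

None of these characters is NUL or "/", yet the 16-bit form produces those
byte values for each of them, which is what trips up byte-oriented Unix
file handling. In UTF-8 every byte of a non-ASCII character has its high
bit set, so the same check comes back empty.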

Anyway, Ruud

"Gewoon is een tijger."
