perl.perl6.internals | Postings from June 2001

Re: Should we care much about this Unicode-ish criticism?

From: Dan Sugalski
Date: June 5, 2001 15:30
Subject: Re: Should we care much about this Unicode-ish criticism?
Message ID: 5.1.0.14.0.20010605182934.0224f1a0@24.8.96.48

At 03:21 PM 6/5/2001 -0700, Russ Allbery wrote:
>Dan Sugalski <dan@sidhe.org> writes:
> > At 12:40 PM 6/5/2001 -0700, Russ Allbery wrote:
>
> >> (As an aside, UTF-8 also is not an X-byte encoding; UTF-8 is a variable
> >> byte encoding, with each character taking up anywhere from one to six
> >> bytes in the encoded form depending on where in Unicode the character
> >> falls.)
>
> > Have they changed that again? Last I checked, UTF-8 was capped at 4
> > bytes, but that's in the Unicode 3.0 standard.
>
>Yes, it changed with Unicode 3.1 when they started allocating characters
>from higher planes.

Ah. I've the PDFs for 3.0. Time to go digging for references again, I see.

>Far and away the best reference for UTF-8 that I've found is RFC 2279.
>It's much more concise and readable than the version in the Unicode
>standard, and is more aimed at implementors and practical considerations.

It can't be much worse than what's provided in the Unicode 3.0 book--it has
an absolutely abysmal description. (When you can even find the darned thing.)

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk
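
The byte counts argued over in this thread can be checked against the
encoding table in RFC 2279, the reference Russ recommends. What follows is a
minimal sketch rather than anything posted to the list: the names
RFC2279_LIMITS and utf8_length_rfc2279 are hypothetical, and the range
boundaries are taken from the RFC 2279 table, which covers the full 31-bit
UCS-4 space.

```python
# Sketch only: names are hypothetical, ranges follow the RFC 2279 table.
# Each pair is (largest value in the class, bytes needed to encode it).
RFC2279_LIMITS = [
    (0x7F, 1),
    (0x7FF, 2),
    (0xFFFF, 3),
    (0x1FFFFF, 4),
    (0x3FFFFFF, 5),
    (0x7FFFFFFF, 6),
]


def utf8_length_rfc2279(code_point):
    """Return how many bytes RFC 2279 UTF-8 needs for code_point."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    for limit, length in RFC2279_LIMITS:
        if code_point <= limit:
            return length
    raise ValueError("value does not fit in 31 bits")


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0x10000, 0x10FFFF, 0x200000, 0x7FFFFFFF):
        print(f"U+{cp:04X}: {utf8_length_rfc2279(cp)} byte(s)")
```

Under this table, every value up to U+10FFFF fits in four bytes; the five-
and six-byte forms are only reached by values above 0x1FFFFF.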

