
Re: The State of The Unicode

From: andrew
Date: February 19, 2001 21:21
Subject: Re: The State of The Unicode
Message ID: 20010220002111.S17705@pimlott.ne.mediaone.net

Sorry for answering this very basic mail last--it ended up at the
bottom in thread view.

On Mon, Feb 19, 2001 at 07:01:29PM -0600, Jarkko Hietaniemi wrote:
> I drafted the big \xHH vs \x{HH} vs chr vs v vs .. table, and (I
> thought) all I talked about was the internal representation, how
> many bytes are generated where.

You can see that I (with an honest reading of that message, and the
ones preceding it) didn't understand it this way.  On one hand,
this probably points to the need for clarification.  On the other,
sorry for any confusion.

> There were *internal* inconsistencies and I straightened them out.
> I do agree that _for_the_normal_user_ whether chr(128) is
> internally one or two bytes long shouldn't matter: and guess what,
> currently chr(128) eq "\x80" && chr(128) eq "\x{80}" && chr(128)
> eq v128 && chr(128) eq pack("C", 128).  So what's the problem we
> are trying to solve here?

We are trying to answer: what gets printed when I output these guys
to a UTF8 handle?  What are the results of isalpha, toupper, and the
Unicode property regexp patterns?  In short, we are trying to
identify which character the darn thing is.
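
The three questions can be asked of a concrete string.  The following
is a rough sketch, assuming a perl of 5.8.1 or later with PerlIO; the
sample code point 0xE9 and the helper utf8_bytes_hex are my own choices
for illustration.  It takes one string in its one-byte internal form
and an upgraded copy, and records the bytes a UTF-8 handle receives,
the result of uc, and a Unicode property match for each.

```perl
use strict;
use warnings;

my $native   = chr(0xE9);     # stored as a single byte internally
my $upgraded = $native;
utf8::upgrade($upgraded);     # same string, UTF-8 internal form

# Hypothetical helper: print a string through a UTF-8 encoding layer
# into an in-memory file and return the resulting bytes in hex.
sub utf8_bytes_hex {
    my ($string) = @_;
    my $buffer = '';
    open(my $fh, '>:encoding(UTF-8)', \$buffer)
        or die "open failed: $!";
    print {$fh} $string;
    close($fh) or die "close failed: $!";
    return join ' ', map { sprintf '%02X', ord($_) } split //, $buffer;
}

# Collect the answers for both internal representations.
my %report;
for my $case ([native => $native], [upgraded => $upgraded]) {
    my ($label, $string) = @$case;
    $report{$label} = {
        bytes    => utf8_bytes_hex($string),
        uc_ord   => ord(uc $string),
        is_alpha => ($string =~ /\p{Alphabetic}/ ? 1 : 0),
    };
}

for my $label (sort keys %report) {
    my $r = $report{$label};
    printf "%-8s bytes=[%s] uc=0x%02X alphabetic=%d\n",
        $label, $r->{bytes}, $r->{uc_ord}, $r->{is_alpha};
}
```

Both copies are eq, and both should reach the handle as the same two
bytes.  Whether uc treats the native copy as a letter can depend on
the perl version and on whether the unicode_strings feature is in
effect, which is exactly the "which character is it" question.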

Andrew
