Re: The State of The Unicode

From: Jarkko Hietaniemi
Date: February 19, 2001 15:19
Subject: Re: The State of The Unicode
Message ID: 20010219171930.D15351@chaos.wustl.edu

On Mon, Feb 19, 2001 at 06:07:14PM -0500, Andrew Pimlott wrote:
> Thank you for your prompt reply--you did read the whole thing,
> right?  ;-)

Yes, though I didn't ponder every detail.

> On Mon, Feb 19, 2001 at 04:47:53PM -0600, Jarkko Hietaniemi wrote:
> > (1) The current model, both externally and internally,
> >     follows what is described by the Camel Mk3.
> 
> Camel III has zero complete examples of Unicode support (unless
> there are examples outside of the Unicode section, which I have not
> read).  The Unicode chapter is a scant nine pages.  There is nothing
> there to violate.

There are rules like "old non-Unicode-aware programs doing byte
things shall not break".

> I agree that I have seen no examples as far as pure string
> manipulation.  But the relationship between strings and numbers must

Just manipulate them.  As people seem lately to be eager to chant:
"transparent" :-)

> > Combine (1) and (2) and I see it as a "what is broken, so what's there
> > to fix" situation, let's call it (3).
> > 
> > As far as "what is broken" goes, I do understand the concern of
> > "exposing too much of the internal representation" (which at the
> > moment happens to be UTF-8) to the user: having both bytes and
> > characters is confusing at best.  However, I'm not fully convinced
> > that completely hiding it is wise, either.  If from Perl level one
> > cannot reach back to the bytes comprising the UTF-8 representation of
> > the characters, I feel we are trying to pad the cell too softly.
> 
> My kingdom for one example.

You want to prototype the Unicode composition and decomposition
algorithms in Perl, or you want to write an implementation of SCSU
(a Unicode compression scheme) in Perl.  You want to convert UTF-8
into UTF-16.  Anywhere you want to get into the guts of the
encoding(s).
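
To make the UTF-8 to UTF-16 case concrete, here is a minimal sketch of what "getting into the guts" might look like.  It is written for a present-day Perl and uses only pack/unpack; the sub names utf8_bytes_to_codepoints and codepoints_to_utf16be are made up for illustration and are not part of any Perl API discussed in this thread.  It assumes the input is already a string of raw UTF-8 octets, and it does not reject overlong forms or encoded surrogates.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a string of raw UTF-8 octets into code points.
sub utf8_bytes_to_codepoints {
    my ($bytes) = @_;
    my @octets = unpack 'C*', $bytes;
    my @cps;
    while (@octets) {
        my $lead = shift @octets;
        my ($cp, $need);
        # The lead byte tells us how many continuation bytes follow.
        if    ($lead < 0x80)           { ($cp, $need) = ($lead, 0) }
        elsif (($lead & 0xE0) == 0xC0) { ($cp, $need) = ($lead & 0x1F, 1) }
        elsif (($lead & 0xF0) == 0xE0) { ($cp, $need) = ($lead & 0x0F, 2) }
        elsif (($lead & 0xF8) == 0xF0) { ($cp, $need) = ($lead & 0x07, 3) }
        else { die sprintf "unexpected lead byte 0x%02X\n", $lead }
        for (1 .. $need) {
            my $cont = shift @octets;
            die "truncated or malformed sequence\n"
                unless defined $cont && ($cont & 0xC0) == 0x80;
            # Each continuation byte contributes its low six bits.
            $cp = ($cp << 6) | ($cont & 0x3F);
        }
        push @cps, $cp;
    }
    return @cps;
}

# Hypothetical helper: encode code points as big-endian UTF-16 octets,
# using a surrogate pair for anything above U+FFFF.
sub codepoints_to_utf16be {
    my @units;
    for my $cp (@_) {
        if ($cp < 0x10000) {
            push @units, $cp;
        }
        else {
            my $v = $cp - 0x10000;
            push @units, 0xD800 | ($v >> 10), 0xDC00 | ($v & 0x3FF);
        }
    }
    return pack 'n*', @units;
}

# U+00E9, U+20AC and U+1D11E, spelled out as UTF-8 octets.
my $utf8  = "\xC3\xA9\xE2\x82\xAC\xF0\x9D\x84\x9E";
my @cps   = utf8_bytes_to_codepoints($utf8);
my $utf16 = codepoints_to_utf16be(@cps);
my $hex   = join ' ', map { sprintf '%02X', $_ } unpack 'C*', $utf16;
print "$hex\n";   # prints "00 E9 20 AC D8 34 DD 1E"
```

None of this works unless the program can see the individual octets behind the characters, which is the point of the paragraph above.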

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
