On Mon, Feb 19, 2001 at 06:07:14PM -0500, Andrew Pimlott wrote:
> Thank you for your prompt reply--you did read the whole thing,
> right? ;-)

Yes, though I didn't ponder every detail.

> On Mon, Feb 19, 2001 at 04:47:53PM -0600, Jarkko Hietaniemi wrote:
> > (1) The current model, both externally and internally,
> > follows what is described by the Camel Mk3.
>
> Camel III has zero complete examples of Unicode support (unless
> there are examples outside of the Unicode section, which I have not
> read).  The Unicode chapter is a scant nine pages.  There is nothing
> there to violate.

There are rules like "old non-Unicode-aware programs doing byte things
shall not break".

> I agree that I have seen no examples as far as pure string
> manipulation.  But the relationship between strings and numbers must

Just manipulate them.  As people seem lately to be eager to chant:
"transparent" :-)

> > Combine (1) and (2) and I see it as "what is broken, so what's there to
> > fix" situation, let's call it (3).
> >
> > As far "what is broken", I do understand the concern of "exposing too
> > much of the internal representation" (which at the moment happens to
> > be UTF-8) to the user, having bytes and character is confusing at
> > best.  However, I'm not fully convinced that completely hiding it is
> > wise, either.  If from Perl level one cannot reach back to the bytes
> > comprising the UTF-8 representation of the characters, I feel we are
> > trying to pad the cell too softly.
>
> My kingdom for one example.

You want to prototype the Unicode composition and decomposition
algorithms in Perl, or you want to implement SCSU (a Unicode
compression algorithm) in Perl.  You want to convert UTF-8 into UTF-16.
Anywhere you want to get into the guts of the encoding(s).

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
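
As one illustration of the "convert UTF-8 into UTF-16" case, here is a minimal sketch of what reaching back to the bytes looks like. It is written in Python rather than Perl, the function names are hypothetical (they come from no library discussed in this thread), and it assumes well-formed input: it does not reject overlong forms or encoded surrogates.

```python
def utf8_to_code_points(data):
    """Decode UTF-8 bytes by hand, looking at lead and continuation bytes."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                # 0xxxxxxx: one byte (ASCII)
            cp, extra = lead, 0
        elif lead >> 5 == 0b110:       # 110xxxxx: two-byte sequence
            cp, extra = lead & 0x1F, 1
        elif lead >> 4 == 0b1110:      # 1110xxxx: three-byte sequence
            cp, extra = lead & 0x0F, 2
        elif lead >> 3 == 0b11110:     # 11110xxx: four-byte sequence
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError("invalid lead byte 0x%02X at offset %d" % (lead, i))
        tail = data[i + 1 : i + 1 + extra]
        if len(tail) != extra:
            raise ValueError("truncated sequence at offset %d" % i)
        for b in tail:
            if b >> 6 != 0b10:         # continuation bytes are 10xxxxxx
                raise ValueError("invalid continuation byte 0x%02X" % b)
            cp = (cp << 6) | (b & 0x3F)
        code_points.append(cp)
        i += 1 + extra
    return code_points


def code_points_to_utf16_units(code_points):
    """Turn code points into 16-bit code units, using surrogate pairs above U+FFFF."""
    units = []
    for cp in code_points:
        if cp < 0x10000:
            units.append(cp)
        else:
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))    # high surrogate
            units.append(0xDC00 | (cp & 0x3FF))  # low surrogate
    return units


def utf8_to_utf16_units(data):
    """Hypothetical helper: UTF-8 bytes in, list of UTF-16 code units out."""
    return code_points_to_utf16_units(utf8_to_code_points(data))


sample = "a\u00e9\u263a".encode("utf-8")
print(["%04X" % u for u in utf8_to_utf16_units(sample)])  # ['0061', '00E9', '263A']
```

None of this can be written if the language only ever hands you whole characters; the first function needs the individual bytes of the UTF-8 form.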