On Tue, Feb 06, 2007 at 06:27:59PM +0100, Gerard Goossen wrote:
> > This is not a matter of context, by the way. Instead, the value "\xFF"
> > is polymorphic. It's both a unicode string representing code point
> > U+00FF, and the single byte 0xFF.
> No. \xFF creates a character represented by FF according to the native
> encoding.
> If your native encoding is EBCDIC this does NOT correspond to
> U+00FF (instead it corresponds to U+007E or U+009F, depending on the
> flavor of EBCDIC you're on).
> You also assume that \xFF in the native encoding corresponds to a byte
> You assume (like everybody else) that in the native encoding a
> character corresponds to a byte with the same numeric value.
> This assumption is what makes the transition to UTF-8 so difficult,
> because in the UTF-8 encoding, the assumption is NOT correct.

I think you are saying that UTF-EBCDIC should be the internal
representation for strings in Perl on EBCDIC platforms if any character
in the string has a value >= 0x80.

If this is what you are saying, then I can see why I, and other people,
cannot understand you. We're not on the same page. I don't believe
UTF-EBCDIC makes sense, as UTF-EBCDIC is not an encoding of UNICODE. It
is an encoding of a mix between EBCDIC and UNICODE. Although UTF-8 is
only an encoding scheme, most people assume that the internal
representation for a language that claims to support UNICODE should be
UNICODE, and therefore that UTF-8 should be encoding UNICODE code
points, not EBCDIC/UNICODE code points.

Perhaps this would represent a performance degradation for systems that
use EBCDIC natively? Is this why you would focus on UTF-EBCDIC?

Anyways - I've not shared people's opinions that Perl's implementation
of UNICODE or UTF-8 is excellent. I've avoided it wherever possible. I
prefer Java's approach or GTK's approach. Java uses UTF-16 as its
internal representation, but never confuses internal representation
with external representation.
If portability is of concern, this seems an excellent approach.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
Neighbourhood Coder
Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/
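
To make the encoding point in the quoted exchange concrete, here is a
minimal sketch in Python (not Perl, and not anything from the thread).
It assumes cp037 as one representative EBCDIC flavour, and the helper
name describe_byte is hypothetical. It shows that the byte 0xFF maps to
different code points under Latin-1 and under that EBCDIC code page, and
that U+00FF is no longer a single byte once encoded as UTF-8.

```python
# Illustrative sketch only. The codecs used (latin-1, cp037, utf-8,
# utf-16-be) ship with Python; cp037 is just one EBCDIC flavour, chosen
# here as an assumption.

def describe_byte(byte_value: int) -> dict:
    """Interpret one byte under two native encodings, then re-encode."""
    raw = bytes([byte_value])
    latin1_char = raw.decode("latin-1")  # byte value == code point
    ebcdic_char = raw.decode("cp037")    # byte value != code point
    return {
        "latin1_code_point": ord(latin1_char),
        "cp037_code_point": ord(ebcdic_char),
        # External representations of the Latin-1 character U+00FF:
        "utf8_bytes": latin1_char.encode("utf-8"),
        "utf16be_bytes": latin1_char.encode("utf-16-be"),
    }

result = describe_byte(0xFF)
for name, value in result.items():
    if isinstance(value, int):
        print(f"{name}: U+{value:04X}")
    else:
        print(f"{name}: {value.hex(' ')}")
```

Under these assumptions the same byte is U+00FF in Latin-1 but U+009F
in cp037, and U+00FF becomes the two bytes c3 bf in UTF-8. That is the
sense in which "one character equals one byte with the same numeric
value" stops holding.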