
Re: Future Perl development

February 6, 2007 10:41
On Tue, Feb 06, 2007 at 06:27:59PM +0100, Gerard Goossen wrote:
> > This is not a matter of context, by the way. Instead, the value "\xFF"
> > is polymorphic. It's both a unicode string representing code point
> > U+00FF, and the single byte 0xFF.
> No. \xFF creates a character represented by FF according to the native
> encoding.
> If your native encoding is EBCDIC this does NOT correspond to
> U+00FF (instead it corresponds to U+007E or U+009F, depending on the
> flavor of EBCDIC you're on).
> You assume (like everybody else) that in the native encoding a
> character corresponds to a byte with the same numeric value.
> This assumption is what makes the transition to UTF-8 so difficult,
> because in the UTF-8 encoding, the assumption is NOT correct. 

I think you are saying that UTF-EBCDIC should be the internal representation
for strings in Perl on EBCDIC platforms if any character in the string
has a value >= 0x80.

If this is what you are saying, then I can see why I and other people
cannot understand you. We're not on the same page. I don't believe
UTF-EBCDIC makes sense, as UTF-EBCDIC is not an encoding of UNICODE.
It is an encoding of a mix of EBCDIC and UNICODE. Although UTF-8
is only an encoding scheme, most people assume that the internal
representation for a language that claims to support UNICODE should
be UNICODE, and therefore that UTF-8 should be encoding UNICODE code
points, not EBCDIC/UNICODE code points.
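
To make the distinction concrete, here is a rough sketch in Python rather
than Perl, purely for illustration. It assumes Python's bundled "cp037"
codec as a stand-in for "the native encoding" on an EBCDIC platform, which
is only one EBCDIC flavor and says nothing about what Perl does internally.
The variable names are made up for the example.

```python
# Illustrative sketch only: Python's codecs stand in for the encodings
# discussed here; none of this reflects Perl's internal implementation.

CODE_POINT = "\u00ff"  # the UNICODE code point U+00FF

# In Latin-1, the code point and the byte share the numeric value 0xFF.
latin1_bytes = CODE_POINT.encode("latin-1")

# In UTF-8, the same code point becomes two bytes, neither of them 0xFF.
utf8_bytes = CODE_POINT.encode("utf-8")

# Assumption: cp037 represents the native encoding on an EBCDIC system.
# The byte 0xFF decodes to some character other than U+00FF there.
ebcdic_char = bytes([0xFF]).decode("cp037")

print("latin-1 bytes:", latin1_bytes.hex())
print("utf-8 bytes:  ", utf8_bytes.hex())
print("cp037 0xFF is: U+%04X" % ord(ebcdic_char))
```

The point of the sketch is only that "the byte 0xFF" and "the code point
U+00FF" coincide in Latin-1 and in no other encoding shown.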

Perhaps this would represent a performance degradation for systems
that use EBCDIC natively? Is this why you would focus on UTF-EBCDIC?

Anyways - I've not shared people's opinion that Perl's implementation
of UNICODE or UTF-8 is excellent. I've avoided it wherever possible.
I prefer Java's approach or GTK's approach. Java uses a UTF-16 internal
representation, but never confuses internal representation with
external representation. If portability is a concern, this seems
an excellent approach.
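
As a rough sketch of what keeping the two representations apart looks
like, here is the same idea in Python (not Java or Perl), where text and
bytes are distinct types. The helper names read_text and write_text are
hypothetical, invented for this example; the encoding is named explicitly
at each boundary and nowhere else.

```python
import io

def read_text(stream, encoding):
    """Decode external bytes into an internal string at the input boundary."""
    return stream.read().decode(encoding)

def write_text(stream, text, encoding):
    """Encode an internal string into external bytes at the output boundary."""
    stream.write(text.encode(encoding))

# External data arrives as Latin-1: a single byte 0xFF.
incoming = io.BytesIO(b"\xff")
text = read_text(incoming, "latin-1")

# Inside the program, text is one character, whatever its storage format.
assert len(text) == 1

# The same text leaves the program as UTF-8, chosen only at the boundary.
outgoing = io.BytesIO()
write_text(outgoing, text, "utf-8")
result = outgoing.getvalue()
print(result.hex())
```

Nothing between the two boundaries needs to know how the string is stored,
which is the property I find attractive.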


-- / /     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...
