
Re: use encoding 'utf8' bug for Latin-1 range

From: Jarkko Hietaniemi
Date: February 25, 2008 17:02
Subject: Re: use encoding 'utf8' bug for Latin-1 range
Message ID: 47C36523.4040703@iki.fi

Juerd Waalboer <juerd@convolution.nl> spoke a word, uttered thus:

: encoding.pm has a broken design,

On that I agree, being the designer.  I tried to cram too much
functionality into the defaults, plus of course planted bugs.

: and for that reason, any fix will
: probably break almost all existing code using it.
:
: Unfortunately, it applies \x escapes 00..ff before it decodes the source.
: This means that for 8bit encodings, you can only use characters in the
: latin1 range if the same character happens to be in the 0..255 range for
: your chosen encoding. E.g. with "use encoding 'koi8r';" it is no longer
: possible to have a literal é (U+00e9, eacute), not even with chr().
:
: Because there are other problems with encoding.pm, that can also not be
: fixed without breaking backward compatibility, I suggest the following
: simple 4 step plan for the future, that is backwards compatible:
:
: 0. keep encoding.pm and ${^ENCODING} (the actual problem) broken
: 1. deprecate encoding.pm; complain loudly with a mandatory warning
: 2. do the same for ${^ENCODING}
: 3. advocate the use of utf8 and "use utf8" for non-latin1 source code
: 4. strongly discourage the use of non-latin1 non-utf8 source code
: 5. modify open.pm to provide a way to set *only* STDIN and STDOUT

On point 4 I can only disagree.  While I do strongly believe that
Unicode is a good thing, and UTF-8 is the only sane source code encoding
of it, I also strongly think that dictating its use is Not The Perl Way.
Unless Rule 1 says that UTF-8 is the Only Way, we should allow for
having code (remember, in Perl this also includes things like POD
and here-docs) in legacy encodings.  The only reason I implemented
(poorly) the encoding pragma was to support legacy encodings, especially
the East Asian ones.
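
To make the quoted \x problem concrete, here is a minimal sketch of the described order of operations: the escape becomes a raw byte first, and that byte is then decoded with the source encoding. It is written in Python purely as a model and is not encoding.pm's actual implementation; the names escape_then_decode and reachable_characters are hypothetical, and "koi8_r" is Python's codec name for KOI8-R.

```python
# Illustrative model only; not encoding.pm's actual implementation.
SOURCE_ENCODING = "koi8_r"  # Python's codec name for KOI8-R


def escape_then_decode(byte_value: int, encoding: str = SOURCE_ENCODING) -> str:
    """Treat \\xNN as a raw byte, then decode that byte with the source encoding."""
    return bytes([byte_value]).decode(encoding)


def reachable_characters(encoding: str = SOURCE_ENCODING) -> set:
    """All characters obtainable from a single byte 0x00..0xFF in this model."""
    return {escape_then_decode(b, encoding) for b in range(256)}


got = escape_then_decode(0xE9)
# The byte 0xE9 is a Cyrillic capital letter in KOI8-R, not eacute.
print(f"\\xe9 under {SOURCE_ENCODING}: U+{ord(got):04X}")
# No single byte decodes to U+00E9, so eacute cannot be written this way.
print("U+00E9 reachable:", "\u00e9" in reachable_characters())
```

In this model \xe9 comes out as U+0418 under KOI8-R, and no byte value yields U+00E9 at all, which matches the complaint that a literal eacute becomes unreachable. With a Latin-1 source encoding the same function returns U+00E9, which is why the problem only shows up when the chosen encoding disagrees with Latin-1 in the 80..ff range.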
