Re: use encoding 'utf8' bug for Latin-1 range

Jarkko Hietaniemi
February 25, 2008 17:02
Juerd Waalboer spoke a word, uttered thus:

: has a broken design,

On that I agree, being the designer.  I tried to cram too much
functionality into the defaults, plus of course planted bugs.

: and for that reason, any fix will
: probably break almost all existing code using it.
: Unfortunately, it applies \x escapes 00..ff before it decodes the source.
: This means that for 8bit encodings, you can only use characters in the
: latin1 range if the same character happens to be in the 0..255 range for
: your chosen encoding. E.g. with "use encoding 'koi8r';" it is no longer
: possible to have a literal é (U+00e9, eacute), not even with chr().
: Because there are other problems with, that can also not be
: fixed without breaking backward compatibility, I suggest the following
: simple 4 step plan for the future, that is backwards compatible:
: 0. keep and ${^ENCODING} (the actual problem) broken
: 1. deprecate; complain loudly with a mandatory warning
: 2. do the same for ${^ENCODING}
: 3. advocate the use of utf8 and "use utf8" for non-latin1 source code
: 4. strongly discourage the use of non-latin1 non-utf8 source code
: 5. modify to provide a way to set *only* STDIN and STDOUT
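
A minimal sketch of the escape-before-decode order described in the quoted
text above.  It does not load the encoding pragma at all; it assumes the
effect can be modelled by first producing the byte for \xE9 and then decoding
that byte as KOI8-R with Encode::decode.  That modelling is an assumption
made for illustration, not a description of the pragma's internals.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Step 1: the \x escape is applied first and yields a single byte, 0xE9.
# The author of the code meant U+00E9 (e with acute).
my $byte = "\xE9";

# Step 2 (assumed model): that byte is then decoded as the source
# encoding, here KOI8-R, instead of being kept as code point 0xE9.
my $decoded = decode('koi8-r', $byte);

# Without any reinterpretation, chr() gives the intended code point.
my $intended = chr(0xE9);

printf "intended: U+%04X\n", ord($intended);   # U+00E9
printf "decoded:  U+%04X\n", ord($decoded);    # U+0418
```

In KOI8-R the byte 0xE9 is CYRILLIC CAPITAL LETTER I (U+0418), so under this
model the intended U+00E9 never appears.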

On point 4, I can but disagree.  While I do strongly believe that
Unicode is a good thing, and UTF-8 is the only sane source code encoding
of it, I also strongly think that dictating its use is Not The Perl Way.
Unless Rule 1 says that UTF-8 is the Only Way, we should allow for
having code (remember, in Perl this includes also things like POD
and here-docs) in legacy encodings.  The only reason I implemented
(poorly) the encoding pragma was to support legacy encodings, especially
the Eastern Asian ones.
