develooper Front page | perl.perl5.porters | Postings from February 2008

Re: use encoding 'utf8' bug for Latin-1 range

February 27, 2008 02:45
On 27/02/2008, Juerd Waalboer wrote:
> Glenn Linderman skribis 2008-02-27  1:50 (-0800):
>  > * Deprecate "use encoding".
>  Agreed.
>  > * Deprecate non-ASCII characters in Perl 5.12 source code unless a
>  > source encoding is specified.  Make UTF-8, rather than ASCII, the
>  > default source encoding for Perl 5.14.
> I wouldn't object, but would prefer to see Perl 5.12 already interpret
>  source code as UTF-8 if it happens to indeed be valid UTF-8. A silent or
>  warning fallback to latin1 could be used for backwards compatibility.

It does, except that it does not use a heuristic to determine whether
the source is valid utf8 (which is the only way to tell); it looks for
BOM markers. Set up your editor to put BOM markers on your source files
and they will automatically be interpreted as being in utf8 (assuming
you put utf8 BOM markers on; if you put utf16 markers, the file will be
treated as utf16).
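
To make the BOM approach concrete, here is a minimal sketch of BOM sniffing on raw bytes. It is not perl's own tokenizer logic; the helper name sniff_bom is hypothetical, and it only covers the UTF-8 and UTF-16 marks mentioned above (a UTF-32LE mark starts with the same two bytes as UTF-16LE and is not distinguished here).

```perl
use strict;
use warnings;

# Hypothetical helper: guess an encoding name from the leading bytes
# of a file's contents. Expects a byte string (read with :raw).
# Returns nothing (undef in scalar context) when no BOM is present.
sub sniff_bom {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;   # UTF-8 signature
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;       # big-endian mark
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;       # little-endian mark
    return;
}
```

A caller would read the first few bytes of the file in raw mode and pass them in; a file without a BOM gives no answer at all, which is exactly the case the heuristic proposal tries to cover.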

Frankly, I'm against using heuristics to determine encoding. Better to
just tell people to use editors that ensure the correct BOM is
prepended to the file.
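
For comparison, the kind of heuristic being argued against could be sketched roughly as below: try a strict UTF-8 decode and fall back to Latin-1 when it fails. The helper name decode_source is hypothetical and this is only an illustration using the Encode module, not anything perl does when reading source.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: decode bytes as UTF-8 if they form valid UTF-8,
# otherwise fall back to Latin-1. Returns (text, encoding name used).
sub decode_source {
    my ($bytes) = @_;
    # FB_CROAK makes decode die on malformed input; LEAVE_SRC keeps
    # $bytes intact so the fallback can decode it again.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return ($text, 'UTF-8') if defined $text;
    return (decode('ISO-8859-1', $bytes), 'ISO-8859-1');
}
```

The weakness is that the guess can be wrong without any error: the bytes C3 A9 are valid UTF-8 for "é", but they are also a legitimate Latin-1 spelling of "Ã©", and the heuristic will always pick the former.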


perl -Mre=debug -e "/just|another|perl|hacker/"
