
Re: use encoding 'utf8' bug for Latin-1 range

February 28, 2008 14:38
On Thursday 28 February 2008 22:53:29 Glenn Linderman wrote:
> On approximately 2/28/2008 1:12 PM, came the following characters from
> the keyboard of Nicholas Clark:
> > On Thu, Feb 28, 2008 at 08:34:12PM +0100, Tels wrote:
> >> In any event, I don't see why "use utf-8" shouldn't die when the
> >> source contains non-utf-8. After all, you just told Perl it does
> >> ;)
> >
> > I would have liked it if it did. But it already seems that we have
> > it the wrong way, and I'd prefer to deprecate the wrongness, than
> > change it again.
> >
> > Nicholas Clark
> I think Tels made a typo... but what he said "use utf-8;" currently
> produces "Can't locate"  Of course, I doubt he meant to
> pass negative 8 as a parameter to the module...

Yes, it was a typo, but your idea is interesting.

But still, the idea that a "heuristic" decides whether "c3b6" is UTF-8 or
ISO-8859-1 bothers me. It can't decide which is which, because the bytes are
valid in both. This will only lead to subtle problems, like adding one byte
to your source and suddenly having it interpreted in a different encoding.
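
To make the ambiguity concrete, here is a minimal sketch. It is written in Python purely for illustration, and the names (raw, longer, is_valid_utf8) are made up for this example. It only shows that the two bytes c3 b6 decode without error under both encodings, and that one appended byte changes whether the UTF-8 reading is possible at all. It is not how any Perl heuristic is actually implemented.

```python
# Illustration only: the bytes 0xC3 0xB6 are well-formed in both encodings.
raw = bytes.fromhex("c3b6")

as_utf8 = raw.decode("utf-8")         # one character: U+00F6
as_latin1 = raw.decode("iso-8859-1")  # two characters: U+00C3 U+00B6


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Append a single Latin-1 byte (0xE4). The buffer is no longer valid UTF-8,
# so a guessing heuristic would now have to read the same leading bytes
# as ISO-8859-1 instead.
longer = raw + bytes.fromhex("e4")

print(len(as_utf8), len(as_latin1))
print(is_valid_utf8(raw), is_valid_utf8(longer))
```

A validity check like this can rule UTF-8 out, but it can never rule ISO-8859-1 out, since every byte sequence is valid ISO-8859-1. That is why the guess depends on what else happens to be in the file.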

Why not just let the programmer specify which encoding he meant? Do we
really need to paper over the laziness of some people and thus
create more problems than we solve?

All the best,


 Signed on Thu Feb 28 23:35:49 2008 with key 0x93B84C15.
 View my photo gallery:
 PGP key on or per email.

 Morton's Law: If rats are experimented upon, they will develop cancer.
