develooper Front page | perl.perl5.porters | Postings from January 2010

Re: warding against bytes.pm

From: Marvin Humphrey
Date: January 5, 2010 11:43
Subject: Re: warding against bytes.pm
Message ID: 20100105194347.GA12599@rectangular.com

On Tue, Jan 05, 2010 at 06:56:42PM +0000, John wrote:

> PS This means we need to remove any talk of UTF-8 encoding for Perl 
> characters from all documentation except the Perl Guts docs.

I think this is idealistic.  In my opinion, it's not practical to use Perl's
Unicode tools without understanding the implementation down to the C
representation, including the SVf_UTF8 flag and how to use Devel::Peek to
snoop it.  Unicode glitches are just too hard to debug without such expertise.  

Getting UTF-8 into Perl scalars was an awesome hack, but the implementation is
prone to silent failure.  I don't think it's practical to harden the system
now, for the same backwards compatibility reasons that it wasn't done in the
first place.  So, our only defense will continue to be troubleshooting.

If you're going to purge leakage of the abstraction from the primary
documentation, I think it makes sense to arm users with a tutorial, named,
say, "perldebugunicode", that intentionally goes where the rest of the
documentation does not, covering the SVf_UTF8 flag and Devel::Peek::Dump() and
explaining how to diagnose and solve common problems.
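
As a rough illustration of the kind of inspection such a tutorial might open
with, here is a minimal sketch. The sample string and the describe() helper
are hypothetical, chosen only for this example; utf8::is_utf8(),
Encode::decode() and Devel::Peek::Dump() are the real core functions.

```perl
use strict;
use warnings;
use Devel::Peek ();   # core module; Dump() writes to STDERR
use Encode ();

# Hypothetical sample data: the two bytes that encode U+00E9 in UTF-8,
# first as raw bytes, then decoded into a one-character string.
my $bytes = "\xC3\xA9";
my $chars = Encode::decode('UTF-8', $bytes);

# Hypothetical helper: summarize what Perl thinks the scalar holds.
# utf8::is_utf8() reports whether the SVf_UTF8 flag is set.
sub describe {
    my ($label, $sv) = @_;
    return sprintf "%s: length=%d utf8_flag=%s",
        $label, length($sv), utf8::is_utf8($sv) ? "on" : "off";
}

print describe('bytes', $bytes), "\n";
print describe('chars', $chars), "\n";

# Dump the internals of each scalar; check whether UTF8 appears
# in the FLAGS line and compare the PV contents.
Devel::Peek::Dump($bytes);
Devel::Peek::Dump($chars);
```

The two scalars hold the same bytes internally, but only the decoded one
carries the flag, so length() reports 2 for the first and 1 for the second.
Spotting a scalar whose flag is in the unexpected state is often the first
step in tracking down double-encoding or mojibake.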

Marvin Humphrey



