On Tue, Jan 05, 2010 at 06:56:42PM +0000, John wrote:
> PS This means we need to remove any talk of UTF-8 encoding for Perl
> characters from all documentation except the Perl Guts docs.

I think this is idealistic.

In my opinion, it's not practical to use Perl's Unicode tools without understanding the implementation down to the C representation, including the SVf_UTF8 flag and how to use Devel::Peek to snoop it. Unicode glitches are just too hard to debug without such expertise.

Getting UTF-8 into Perl scalars was an awesome hack, but the implementation is prone to silent failure. I don't think it's practical to harden the system now, for the same backwards compatibility reasons that it wasn't done in the first place. So, our only defense will continue to be troubleshooting.

If you're going to purge leakage of the abstraction from the primary documentation, I think it makes sense to arm users with a tutorial, named, say, "perldebugunicode", that intentionally goes where the rest of the documentation does not, covering the SVf_UTF8 flag and Devel::Peek::Dump() and explaining how to diagnose and solve common problems.

Marvin Humphrey