Re: [perl #113088] Data::Dumper::Useqq('utf8') broken [PATCH]

From: Aristotle Pagaltzis
Date: June 7, 2012 08:33
Subject: Re: [perl #113088] Data::Dumper::Useqq('utf8') broken [PATCH]
Message ID: 20120607153306.GA17341@fernweh

* Jim Avera <perlbug-followup@perl.org> [2012-05-26 03:10]:
> However it seems wrong to test for #chars != #bytes, because binary
> data _should_ be passed as byte strings, that is, with Perl's internal
> utf8 flag off.

Disagree.

The UTF8 flag is completely irrelevant to a string’s semantics. Wherever
it’s treated as meaningful, that is a bug that should be fixed. So it
seems to me at first sight that the string should just reach the fast
exit check untouched and be left for the remaining code to deal with.
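
To illustrate the point about the flag, here is a minimal sketch (the variable names are mine, not from Data::Dumper). It builds the same four-character string in both internal representations and shows that the two compare equal and have the same length, even though the flag and the internal byte count differ:

```perl
use strict;
use warnings;

# Same logical string, two internal representations.
my $downgraded = "caf\xe9";
my $upgraded   = "caf\xe9";
utf8::downgrade($downgraded);   # stored as one byte per character, flag off
utf8::upgrade($upgraded);       # stored as UTF-8 internally, flag on

# Semantically the strings are identical.
my $same_string = $downgraded eq $upgraded                 ? 1 : 0;
my $same_length = length($downgraded) == length($upgraded) ? 1 : 0;

# Only the internals differ.
my $flag_down = utf8::is_utf8($downgraded) ? 1 : 0;
my $flag_up   = utf8::is_utf8($upgraded)   ? 1 : 0;

my ($bytes_down, $bytes_up);
{
    use bytes;                  # length() now counts internal bytes
    $bytes_down = length($downgraded);
    $bytes_up   = length($upgraded);
}

print "eq: $same_string, same length: $same_length\n";
print "flag: $flag_down vs $flag_up, internal bytes: $bytes_down vs $bytes_up\n";
```

A test of the form "#bytes != #chars" therefore distinguishes these two values even though no Perl-level string operation should.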

But on a closer read I get the vague impression that the code in the
whole function is based on confused notions about encodings, and that it
should therefore possibly be redone entirely. I am not yet sure exactly
what it is trying to achieve, though.


As an irrelevant aside,

>    s/([^\x00-\x7f])/'\x{'.sprintf("%x",ord($1)).'}'/ge if $bytes > length;

… it’s a mystery to me why the replacement expression was spelled

    '\x{'.sprintf('%x',...).'}'

instead of simply

    sprintf('\x{%x}',...)

and similarly for several other substitutions within the function.
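
For what it's worth, a quick sketch showing that the two spellings produce the same output. The sample input is made up for illustration, and the `if $bytes > length` guard from the quoted line is left out:

```perl
use strict;
use warnings;

# Hypothetical sample: two non-ASCII characters, U+00EF and U+263A.
my $input = "na\x{ef}ve \x{263a}";

# Spelling used in the quoted line: concatenation around sprintf.
(my $concatenated = $input)
    =~ s/([^\x00-\x7f])/'\x{'.sprintf("%x",ord($1)).'}'/ge;

# Simpler spelling: the braces go straight into the format string.
(my $single_format = $input)
    =~ s/([^\x00-\x7f])/sprintf('\x{%x}', ord($1))/ge;

print "$concatenated\n";
print "$single_format\n";
```

Both lines print `na\x{ef}ve \x{263a}`, since a backslash before `x` inside single quotes stays literal and sprintf gives it no special meaning.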

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
