* Jim Avera <perlbug-followup@perl.org> [2012-05-26 03:10]:
> However it seems wrong to test for #chars != #bytes, because binary
> data _should_ be passed as byte strings, that is, with Perl's internal
> utf8 flag off.

Disagree. The UTF8 flag is completely irrelevant to a string’s
semantics. Wherever it’s treated as meaningful, that is a bug that
should be fixed.

So it seems to me at first sight that the string should just reach the
fast exit check untouched and be left for the remaining code to deal
with.

But on closer read I get a vague impression that the intent of the code
in the whole function is based on confused notions about encodings. And
that it therefore possibly should be done over entirely. I am not yet
sure exactly what it is trying to achieve, though.

As an irrelevant aside,

> s/([^\x00-\x7f])/'\x{'.sprintf("%x",ord($1)).'}'/ge if $bytes > length;

… it’s a mystery to me why the replacement expression was spelled

    '\x{'.sprintf('%x',...).'}'

instead of simply

    sprintf('\x{%x}',...)

and similarly for several other substitutions within the function.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
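
For illustration only, here is a minimal sketch of the two points made above. The sub names escape_concat and escape_sprintf and the sample strings are hypothetical and are not taken from the function under discussion. The sketch only shows that the two spellings of the replacement expression produce the same text, and that the same characters escape identically whether or not Perl's internal UTF8 flag happens to be set.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this illustration only.

# Replacement spelled as in the quoted code: concatenation around sprintf.
sub escape_concat {
    my ($s) = @_;
    $s =~ s/([^\x00-\x7f])/'\x{'.sprintf("%x",ord($1)).'}'/ge;
    return $s;
}

# Replacement spelled as a single sprintf with the braces in the format.
sub escape_sprintf {
    my ($s) = @_;
    $s =~ s/([^\x00-\x7f])/sprintf('\x{%x}', ord($1))/ge;
    return $s;
}

# Sample containing one character below 0x100 and one above it.
my $sample = "caf\x{e9} \x{263a}";
print escape_concat($sample),  "\n";   # prints caf\x{e9} \x{263a}
print escape_sprintf($sample), "\n";   # prints caf\x{e9} \x{263a}

# Same single character, two internal representations.
my $flag_off = "\x{e9}";       # stored without the UTF8 flag
my $flag_on  = $flag_off;
utf8::upgrade($flag_on);       # same character, UTF8 flag now set

print "strings are equal\n"
    if $flag_off eq $flag_on;
print "escaped forms are equal\n"
    if escape_sprintf($flag_off) eq escape_sprintf($flag_on);
```

Both spellings yield identical output for this input, so the single-sprintf form looks like a drop-in simplification. The last two checks show that upgrading the string changes only its internal representation, not what it compares equal to or how it is escaped.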