develooper Front page | perl.perl5.porters | Postings from September 2016

Re: [perl #126310] no "Malformed UTF-8 character" warning on single-quoted strings under "use utf8"

Karl Williamson
September 16, 2016 20:34
On 09/16/2016 06:46 AM, Florian Schlichting wrote:
> Hi Karl,
> Father Chrysostomos wrote:
>> On Wed Aug 31 20:35:02 2016, khw wrote:
>>> Is the attached like what you mean?
>> Yes, that would work.
>> It would be nice, too, if we could add the `near such and such' that
>> yyerror normally does. Maybe yyerror could have an extra option to croak
>> instead of calling qerror. It already has a flags field.
> thanks for looking into this issue. I tested your patch and can confirm
> that it correctly treats single and double quotes the same:
> % ./perl -C0 -le 'print qq(print "\xB0C";)' | ./perl -I'lib' -Mutf8 -CS -l
> Malformed utf8 at - line 1.
> % ./perl -C0 -le 'print qq(print \x27\xB0C\x27;)' | ./perl -I'lib' -Mutf8 -CS -l
> Malformed utf8 at - line 1.
> However, I feel a little uneasy about dying altogether. Currently Perl
> issues just a warning ("Malformed UTF-8 character") and that seems to be
> the approach with UTF-8 issues encountered in other places in toke.c as
> well. Most of the time, these will be strings displayed to the user, and
> they will mostly still be legible even with a few characters garbled or
> skipped. Don't you think "complain and carry on" is what users would
> expect?
> Florian

But we are running into segfaults because of trying to keep going in the 
face of malformed UTF-8.  I'm thinking the lesson should be to give up 
when we find it, and this is a reasonable place to start.  There are 
places where malformed UTF-8 is fatal.
