On Fri Sep 16 13:34:55 2016, khw wrote:
> On 09/16/2016 06:46 AM, Florian Schlichting wrote:
> > Hi Karl,
> >
> > Father Chrysostomos wrote:
> >> On Wed Aug 31 20:35:02 2016, khw wrote:
> >>> Is the attached like what you mean?
> >>
> >> Yes, that would work.
> >>
> >> It would be nice, too, if we could add the `near such and such' that
> >> yyerror normally does. Maybe yyerror could have an extra option to
> >> croak instead of calling qerror. It already has a flags field.
> >
> > thanks for looking into this issue. I tested your patch and can
> > confirm that it correctly treats single and double quotes the same:
> >
> > % ./perl -C0 -le 'print qq(print "\xB0C";)' | ./perl -I'lib' -Mutf8 -CS -l
> > Malformed utf8 at - line 1.
> >
> > % ./perl -C0 -le 'print qq(print \x27\xB0C\x27;)' | ./perl -I'lib' -Mutf8 -CS -l
> > Malformed utf8 at - line 1.
> >
> > However, I feel a little uneasy about dying altogether. Currently
> > Perl issues just a warning ("Malformed UTF-8 character") and that
> > seems to be the approach with UTF-8 issues encountered in other
> > places in toke.c as well. Most of the time, these will be strings
> > displayed to the user, and they will mostly still be legible even
> > with a few characters garbled or skipped. Don't you think "complain
> > and carry on" is what users would expect?
> >
> > Florian
>
> But we are running into segfaults because of trying to keep going in
> the face of malformed UTF-8. I'm thinking the lesson should be to give
> up when we find it, and this is a reasonable place to start. There are
> places where malformed UTF-8 is fatal.

I agree. If perl keeps going, then even if it does not crash, it will die on those malformed strings later.

--

Father Chrysostomos

---
via perlbug: queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=126310