develooper Front page | perl.perl5.porters | Postings from September 2016

[perl #126310] no "Malformed UTF-8 character" warning on single-quoted strings under "use utf8"

From: Father Chrysostomos via RT
Date: September 16, 2016 22:44
Subject: [perl #126310] no "Malformed UTF-8 character" warning on single-quoted strings under "use utf8"
Message ID: rt-4.0.24-25033-1474065881-7.126310-15-0@perl.org

On Fri Sep 16 13:34:55 2016, khw wrote:
> On 09/16/2016 06:46 AM, Florian Schlichting wrote:
> > Hi Karl,
> >
> > Father Chrysostomos wrote:
> >> On Wed Aug 31 20:35:02 2016, khw wrote:
> >>> Is the attached like what you mean?
> >>
> >> Yes, that would work.
> >>
> >> It would be nice, too, if we could add the `near such and such' that
> >> yyerror normally does. Maybe yyerror could have an extra option to
> >> croak instead of calling qerror. It already has a flags field.
> >
> > thanks for looking into this issue. I tested your patch and can
> > confirm that it correctly treats single and double quotes the same:
> >
> > % ./perl -C0 -le 'print qq(print "\xB0C";)' | ./perl -I'lib'
> > -Mutf8 -CS -l
> > Malformed utf8 at - line 1.
> >
> > % ./perl -C0 -le 'print qq(print \x27\xB0C\x27;)' | ./perl -I'lib'
> > -Mutf8 -CS -l
> > Malformed utf8 at - line 1.
> >
> >
> > However, I feel a little uneasy about dying altogether. Currently
> > Perl issues just a warning ("Malformed UTF-8 character") and that
> > seems to be the approach with UTF-8 issues encountered in other
> > places in toke.c as well. Most of the time, these will be strings
> > displayed to the user, and they will mostly still be legible even
> > with a few characters garbled or skipped. Don't you think "complain
> > and carry on" is what users would expect?
> >
> > Florian
> >
> >
> 
> But we are running into segfaults because of trying to keep going in
> the face of malformed UTF-8.  I'm thinking the lesson should be to
> give up when we find it, and this is a reasonable place to start.
> There are places where malformed UTF-8 is fatal.

I agree.  If perl keeps going, then even if it does not crash, it will die on those malformed strings later.

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=126310


