[perl #113824] Regexp error messages are not UTF8-clean

From: Father Chrysostomos via RT
Date: June 18, 2013 01:28
Subject: [perl #113824] Regexp error messages are not UTF8-clean
Message ID: rt-3.6.HEAD-2552-1371518889-204.113824-15-0@perl.org

On Mon Jun 17 17:58:45 2013, jkeenan wrote:
> On Sun Jun 24 14:18:50 2012, webmasters@ctosonline.org wrote:
> > On a UTF-8 terminal:
> > 
> > $ ./perl -Ilib -CS -e 'use utf8; /ü+++/'
> > Nested quantifiers in regex; marked by <-- HERE in m/ü+++ <-- HERE /
> >    at -e line 1.
> > 
> > ---
> > Flags:
> >     category=core
> >     severity=low
> > ---
> > Site configuration information for perl 5.17.0:
> 
> This ticket has not received a response since filing more than a year ago.
> 
> Could someone who understands what a "UTF-8 terminal" is take a look?

A UTF-8 terminal is one that uses UTF-8 for its character set, so that
typing the character ā inputs the sequence c4 81 and likewise the
sequence c4 81 is displayed as ā.
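
As a minimal sketch of that mapping (an illustration only, assuming the
core Encode module; the variable names are made up for the example), the
character ā is U+0101, and encoding it as UTF-8 gives the two bytes
mentioned above:

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+0101 is LATIN SMALL LETTER A WITH MACRON (the character ā).
my $char  = chr 0x101;

# Encode the single character into its UTF-8 byte sequence.
my $bytes = encode('UTF-8', $char);

# Render each byte as two hex digits.
my $hex = join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;

print "$hex\n";    # c4 81
```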

Now, RT has managed to screw it up completely, but the bug can be
demonstrated like this instead:

$ ./perl -Ilib -CS -e '$c = chr 0x100; /$c+++/' 2>&1 | LANG=C less -U

And less shows:

Nested quantifiers in regex; marked by <-- HERE in m/<C3><84><C2><80>+++
<-- HERE / at -e line 1.

I have -CS set, so the standard handles should output UTF-8.  c3 84 c2
80 is not the UTF-8 sequence for chr 0x100, which is c4 80.
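
The bytes that less shows are consistent with the character having been
encoded as UTF-8 twice.  The following is only a sketch of that
arithmetic, assuming the core Encode module; hex_bytes is a hypothetical
helper, and nothing here claims to show where in perl the extra encoding
actually happens:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($s) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $s;
}

# Correct UTF-8 encoding of U+0100.
my $once = encode('UTF-8', chr 0x100);

# Encoding the result again treats each byte as a Latin-1 character
# (U+00C4 and U+0080) and encodes those in turn.
my $twice = encode('UTF-8', $once);

print hex_bytes($once),  "\n";    # c4 80
print hex_bytes($twice), "\n";    # c3 84 c2 80
```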

Another way to demonstrate it:

use Data::Dumper;
++$Data::Dumper::Useqq;

$c = chr 0x100;
print Dumper $c;
eval '/$c+++/';
print Dumper $@;
__END__
$VAR1 = "\x{100}";
$VAR1 = "Nested quantifiers in regex; marked by <-- HERE in
m/\304\200+++ <-- HERE / at (eval 3) line 1.\n";

The \304\200 should be \x{100}.
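
To spell out the relationship between the two forms (a sketch only,
assuming the core Encode module; it is not a proposed fix for the regex
compiler): the octets \304\200 are the UTF-8 encoding of the character,
so decoding them yields the single character \x{100}.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The two octets that appear in the error message.
my $octets = "\304\200";

# Interpreting them as UTF-8 gives one character.
my $chars = decode('UTF-8', $octets);

printf "%d octets -> %d character, code point 0x%x\n",
    length($octets), length($chars), ord($chars);
# 2 octets -> 1 character, code point 0x100
```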

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=113824
