[perl #133101] Anomalies in handling malformed utf8 input

From: Ricardo SIGNES via RT
Date: April 12, 2018 14:11
Subject: [perl #133101] Anomalies in handling malformed utf8 input
Message ID: rt-4.0.24-21565-1523542246-471.133101-15-0@perl.org

On Wed, 11 Apr 2018 14:35:20 -0700, grinnz@gmail.com wrote:
> Using the options -CSD (-CD makes the special ARGV handle used by -n open
> the passed filename with :utf8, -CS interprets the STDIN with :utf8) and
> -Mutf8 (for the source code passed to -e) should make these examples
> function as expected.
> 
> -Dan

I'm not sure this is sufficient explanation.  Consider:

  ~$ cat bad | perl -CAS -Mutf8 -lne 'print if /ąę/'
  ~$ cat bad | perl -CAS -Mutf8 -lne 'print if /[ąę]/'
  Malformed UTF-8 character (fatal) at -e line 1, <> line 1.

Our input comes from stdin, and we have used -CS, which means STDIN is assumed to be UTF-8.  Both commands use -Mutf8.  Yet we only see a fatal error in the second case, where we have used a character class instead of a literal string.

-- 
rjbs

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=133101
