[perl #55250] utf-8 regex case insensitive character classes mishandle non-utf8 strings
From: John Gardiner Myers
Date: June 4, 2008 01:44
Subject: [perl #55250] utf-8 regex case insensitive character classes mishandle non-utf8 strings
Message ID: rt-3.6.HEAD-11257-1212529954-1057.55250-75-0@perl.org
# New Ticket Created by John Gardiner Myers
# Please include the string: [perl #55250]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=55250 >
This is a bug report for perl from jgmyers@proofpoint.com,
generated with the help of perlbug 1.35 running under perl v5.8.8.
-----------------------------------------------------------------
[Please enter your report here]
Regular expressions with case-insensitive character classes
incorrectly parse non-UTF-8 strings as if they were UTF-8. The bug
reproduces in both 5.8.8 and 5.10.0. Test cases follow:
use strict;
use warnings;
use Test::Warn;
use Test::More qw(no_plan);
warnings_are {ok("\xa9" !~ /[\x{400}-\x{4ff}]/i)} [], "no warnings";
warnings_are {ok("\xc0" =~ /^[\x{400}-\x{4ff}\xc0]/i)} [], "no warnings";
warnings_are {ok("\xe0" =~ /^[\x{400}-\x{4ff}\xc0]/i)} [], "no warnings";
This incorrectly produces the output:
ok 1
not ok 2 - no warnings
# Failed test 'no warnings'
# in /u/jgmyers/nonutf8.t at line 6.
# found warning: Malformed UTF-8 character (unexpected continuation byte
0xa9, with no preceding start byte) in pattern match (m//) at
/u/jgmyers/nonutf8.t line 6.
# didn't expect to find a warning
ok 3
not ok 4 - no warnings
# Failed test 'no warnings'
# in /u/jgmyers/nonutf8.t at line 7.
# found warning: Malformed UTF-8 character (unexpected non-continuation
byte 0x00, immediately after start byte 0xc0) in pattern match (m//) at
/u/jgmyers/nonutf8.t line 7.
# didn't expect to find a warning
not ok 5
# Failed test in /u/jgmyers/nonutf8.t at line 8.
not ok 6 - no warnings
# Failed test 'no warnings'
# in /u/jgmyers/nonutf8.t at line 8.
# found warning: Malformed UTF-8 character (unexpected non-continuation
byte 0x00, immediately after start byte 0xe0) in pattern match (m//) at
/u/jgmyers/nonutf8.t line 8.
# didn't expect to find a warning
1..6
# Looks like you failed 4 tests of 6.
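The warnings above name 0xa9 as a "continuation byte" and 0xc0 and 0xe0 as "start bytes", which is how those bytes read if the subject string is scanned as UTF-8. The following is a minimal sketch, not part of the original report, that classifies the three test bytes by their structural role in UTF-8 and shows that the subject strings do not carry Perl's internal UTF-8 flag. The helper name utf8_byte_role is hypothetical and introduced only for illustration; utf8::is_utf8 is the core function.

```perl
use strict;
use warnings;

# Hypothetical helper: classify one byte value by the role its bit
# pattern would play if the byte were read as part of a UTF-8 sequence.
# This looks only at the leading bits. It does not validate sequences
# (for example, 0xc0 can only begin an overlong form, but its bit
# pattern is still that of a 2-byte lead).
sub utf8_byte_role {
    my ($byte) = @_;
    return 'ASCII'                      if $byte < 0x80;   # 0xxxxxxx
    return 'continuation byte'          if $byte < 0xC0;   # 10xxxxxx
    return 'start of a 2-byte sequence' if $byte < 0xE0;   # 110xxxxx
    return 'start of a 3-byte sequence' if $byte < 0xF0;   # 1110xxxx
    return 'start of a 4-byte sequence' if $byte < 0xF8;   # 11110xxx
    return 'not valid in UTF-8';
}

# The three subject strings from the test cases in this report.
for my $str ("\xa9", "\xc0", "\xe0") {
    printf "0x%02x: %s; UTF-8 flag %s\n",
        ord($str),
        utf8_byte_role(ord $str),
        utf8::is_utf8($str) ? 'on' : 'off';
}
```

Each of the three strings is a single character below 0x100 with the flag off, so nothing in the subject calls for UTF-8 decoding. The byte roles line up with the wording of the warnings: 0xa9 has no preceding start byte, and 0xc0 and 0xe0 are start bytes followed by the string's terminating 0x00 rather than a continuation byte.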
[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
Site configuration information for perl v5.8.8:
Configured by jthaler at Tue May 6 14:16:13 PDT 2008.
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
Platform:
osname=linux, osvers=2.4.21-47.0.1.elsmp,
archname=i686-linux-thread-multi
uname='linux xenon3 2.4.21-47.0.1.elsmp #1 smp thu oct 19 11:33:45
edt 2006 i686 gnulinux '
config_args='-de -Dprefix=/tools/x2/gcc-4.2.2-pps-5.5/perl-5.8.8
-Dcc=gcc -Uinstallusrbinperl -Dusethreads
-Dlibpth=/tools/x2/gcc-4.2.2-pps-5.5/lib /lib /usr/lib
-Dlocincpth=/tools/x2/gcc-4.2.2-pps-5.5/include
-Dloclibpth=/tools/x2/gcc-4.2.2-pps-5.5/lib
-Dcf_email=xtools@proofpoint.com
-Di_gdbm=/tools/x2/gcc-4.2.2-pps-5.5/gdbm/include -Dusemallocwrap=n'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-fno-strict-aliasing -pipe -Wdeclaration-after-statement
-I/tools/x2/gcc-4.2.2-pps-5.5/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
optimize='-O2',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-fno-strict-aliasing -pipe -Wdeclaration-after-statement
-I/tools/x2/gcc-4.2.2-pps-5.5/include -I/usr/include/gdbm'
ccversion='', gccversion='4.2.2', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/tools/x2/gcc-4.2.2-pps-5.5/lib'
libpth=/tools/x2/gcc-4.2.2-pps-5.5/lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.3.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fpic', lddlflags='-shared
-L/tools/x2/gcc-4.2.2-pps-5.5/lib'
Locally applied patches:
---
@INC for perl v5.8.8:
/tools/x2/gcc-4.2.2-pps-5.5/perl-5.8.8/lib/5.8.8/i686-linux-thread-multi
/tools/x2/gcc-4.2.2-pps-5.5/perl-5.8.8/lib/5.8.8
/tools/x2/gcc-4.2.2-pps-5.5/perl-5.8.8/lib/site_perl/5.8.8/i686-linux-thread-multi
/tools/x2/gcc-4.2.2-pps-5.5/perl-5.8.8/lib/site_perl/5.8.8
/tools/x2/gcc-4.2.2-pps-5.5/perl-5.8.8/lib/site_perl
.
---
Environment for perl v5.8.8:
HOME=/u/jgmyers
LANG=en_US.utf8
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/tools/x/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/u/jgmyers/bin
PERL_BADLANG (unset)
SHELL=/bin/bash