[perl #118297] Mixing up- and down-graded strings in regex broken in 5.18.0
From: D. Ilmari Mannsåker
Date: June 4, 2013 18:59
Subject: [perl #118297] Mixing up- and down-graded strings in regex broken in 5.18.0
Message ID: rt-3.6.HEAD-2552-1370355608-703.118297-75-0@perl.org
# New Ticket Created by "D. Ilmari Mannsåker"
# Please include the string: [perl #118297]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=118297 >
This is a bug report for perl from ilmari.mannsaker@net-a-porter.com,
generated with the help of perlbug 1.39 running under perl 5.18.0.
-----------------------------------------------------------------
[Please describe your issue here]
$ perl -e 'utf8::upgrade(my $u = "\x{e5}"); utf8::downgrade(my $d = "\x{e5}"); qr{$u $d}'
Malformed UTF-8 character (1 byte, need 3, after start byte 0xe5) in regexp compilation at -e line 1.
Malformed UTF-8 character (1 byte, need 3, after start byte 0xe5) in regexp compilation at -e line 1.
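For anyone who wants to check other perl versions, the one-liner above can be expanded into a small script. This is only a sketch of the same reproduction: the @warnings array and the printed labels are my own additions for illustration, and what it prints depends on the perl it is run with, so no expected output is given.

```perl
use strict;
use warnings;

# The same character (U+00E5) in two internal representations.
utf8::upgrade(my $u = "\x{e5}");      # stored internally as UTF-8
utf8::downgrade(my $d = "\x{e5}");    # stored internally as a single byte

# Collect any warnings emitted while the pattern is being compiled.
my @warnings;
my $re = do {
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    qr{$u $d};
};

# The two strings compare equal; only the internal UTF-8 flag differs.
printf "u is_utf8=%d d is_utf8=%d equal=%d\n",
    (utf8::is_utf8($u) ? 1 : 0),
    (utf8::is_utf8($d) ? 1 : 0),
    ($u eq $d ? 1 : 0);

printf "warnings during regexp compilation: %d\n", scalar @warnings;
print  "  $_" for @warnings;
printf "matches U+00E5 SPACE U+00E5: %s\n",
    ("\x{e5} \x{e5}" =~ $re ? "yes" : "no");
```

On an affected perl the warning count should be non-zero, matching the two "Malformed UTF-8 character" lines shown above.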
$ ../perl/Porting/bisect.pl -j6 --target=miniperl --start=v5.17.11 --end=v5.18.0-RC1 -e '$u = "\x{666}"; $d = "\x{e5}"; $SIG{__WARN__} = sub { die $_[0] }; qr{$u $d}'
35738543f95c2bc8c0545f370c642a84a0fb4b69 is the first bad commit
commit 35738543f95c2bc8c0545f370c642a84a0fb4b69
Author: David Mitchell <davem@iabyn.com>
Date: Mon Apr 15 17:18:30 2013 +0100
Perl_re_op_compile(): handle utf8 concating better

When concatting the list of arguments together to form a final pattern
string, the code formerly did a quick scan of all the args first, and
if any of them were SvUTF8, it set the (empty) destination string to UTF8
before concatting all the individual args. This avoided the pattern
getting upgraded to utf8 halfway through, and thus the indices for code
blocks becoming invalid.

However this was not 100% reliable because, as an "XXX" code comment of
mine pointed out, when overloading is involved it is possible for an arg
to appear initially not to be utf8, but to be utf8 when its value is
finally accessed. This results in an obscure bug (as shown in the test added
for this commit), where literal /(?{code})/ still required 'use re "eval"'.

The fix for this is to instead adjust the code block indices on the fly
if the pattern string happens to get upgraded to utf8. This is easy(er)
now that we have the new S_pat_upgrade_to_utf8() function.

As well as fixing the bug, this also simplifies the main concat loop in
the code, which will make it easier to handle interpolating arrays (e.g.
/@foo/) when we move the interpolation from the join op into the regex
engine itself shortly.

:100644 100644 f29284632e54afb24df68ec2d0ebfacd8eac5497 f7f309b281a6683815efa0f6d06b5661ffa41b84 M regcomp.c
:040000 040000 27e6c237516a8f9cb3caf0745da433604ab15764 e627a5a459c0bc59d1e0cd8d8f4d837e306d983f M t
bisect run success
That took 321 seconds
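To illustrate why the commit message talks about code block indices becoming invalid, here is a rough Perl-level sketch of the offset arithmetic involved when a downgraded pattern string is upgraded. It is not the C code from regcomp.c, and offset_after_upgrade() is a hypothetical helper I made up for the illustration; the only assumption is the usual one that each character in the range 0x80-0xFF takes one byte downgraded and two bytes once upgraded.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of perl: given a downgraded (byte) string
# and a byte offset into it, return the byte offset of the same position
# once the string is stored as UTF-8.
sub offset_after_upgrade {
    my ($bytes, $offset) = @_;
    my $prefix = substr($bytes, 0, $offset);
    # Each character 0x80-0xFF before the offset grows from 1 byte to 2.
    my $high = ($prefix =~ tr/\x80-\xff//);
    return $offset + $high;
}

# A downgraded pattern string with a literal code block after two
# U+00E5 characters.
my $pat   = "\xe5\xe5(?{ 1 })";
my $start = index($pat, "(?{");
my $new   = offset_after_upgrade($pat, $start);

# Cross-check against the actual UTF-8 encoding of the same characters.
my $encoded = $pat;
utf8::encode($encoded);
my $check = index($encoded, "(?{");

print "code block start: $start before upgrade, $new after (check: $check)\n";
# prints "code block start: 2 before upgrade, 4 after (check: 4)"
```

If the pattern is upgraded part-way through concatenation and an offset like this is not adjusted, it points into the middle of the upgraded string, which is the class of problem the commit message describes.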
[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
Site configuration information for perl 5.18.0:
Configured by ilmari at Mon May 20 10:43:21 BST 2013.
Summary of my perl5 (revision 5 version 18 subversion 0) configuration:
Platform:
osname=linux, osvers=3.2.0-41-generic, archname=x86_64-linux
uname='linux zarquon 3.2.0-41-generic #66-ubuntu smp thu apr 25 03:27:11 utc 2013 x86_64 x86_64 x86_64 gnulinux '
config_args='-de -Dprefix=/home/ilmari/perl5/perlbrew/perls/perl-5.18.0 -Aeval:scriptdir=/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/bin'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='4.6.3', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
libc=, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.15'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'
Locally applied patches:
---
@INC for perl 5.18.0:
/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/lib/site_perl/5.18.0/x86_64-linux
/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/lib/site_perl/5.18.0
/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/lib/5.18.0/x86_64-linux
/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/lib/5.18.0
.
---
Environment for perl 5.18.0:
HOME=/home/ilmari
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/ilmari/perl5/perlbrew/bin:/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
PERLBREW_BASHRC_VERSION=0.61
PERLBREW_CPAN_MIRROR=http://cpanmirror.wtf.nap/
PERLBREW_HOME=/home/ilmari/.perlbrew
PERLBREW_MANPATH=/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/man
PERLBREW_PATH=/home/ilmari/perl5/perlbrew/bin:/home/ilmari/perl5/perlbrew/perls/perl-5.18.0/bin
PERLBREW_PERL=perl-5.18.0
PERLBREW_ROOT=/home/ilmari/perl5/perlbrew
PERLBREW_VERSION=0.61
PERL_BADLANG (unset)
PERL_CPANM_OPT=--mirror=http://cpanmirror.wtf.nap/ --mirror-only
SHELL=/bin/bash