[perl #93824] regex code blocks manipulating regex target can cause undefined behaviour
From: Nicholas Clark
Date: June 30, 2011 04:32
Subject: [perl #93824] regex code blocks manipulating regex target can cause undefined behaviour
Message ID: rt-3.6.HEAD-16080-1309433513-378.93824-75-0@perl.org
# New Ticket Created by Nicholas Clark
# Please include the string: [perl #93824]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=93824 >
This is a bug report for perl from nick@ccl4.org,
generated with the help of perlbug 1.39 running under perl 5.15.0.
-----------------------------------------------------------------
[Please describe your issue here]
The regex engine assumes, in at least some cases, that the scalar it's
matching over can't change. If you use a (?{}) code block inside a regex to
undefine the target scalar, the engine goes on to read the freed buffer:
$ valgrind ./perl -Ilib -le '$a = "ydydydyd"; warn $_ foreach $a =~ /[^x]d(?{undef $a})[^x]d/g'
==46337== Memcheck, a memory error detector
==46337== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==46337== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==46337== Command: ./perl -Ilib -le $a\ =\ "ydydydyd";\ warn\ $_\ foreach\ $a\ =~\ /[^x]d(?{undef\ $a})[^x]d/g
==46337==
--46337-- ./perl:
--46337-- dSYM directory is missing; consider using --dsymutil=yes
==46337== Invalid read of size 1
==46337== at 0x1001AB360: S_reginclass (in ./perl)
==46337== by 0x1001A174F: S_regmatch (in ./perl)
==46337== by 0x10019EADA: S_regtry (in ./perl)
==46337== by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337== by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337== by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337== by 0x10002B4BC: S_run_body (in ./perl)
==46337== by 0x10002ADEE: perl_run (in ./perl)
==46337== by 0x1000014D4: main (in ./perl)
==46337== Address 0x1006019b2 is 2 bytes inside a block of size 10 free'd
==46337== at 0x100280C7C: free (vg_replace_malloc.c:366)
==46337== by 0x1000AE8B5: Perl_safesysfree (in ./perl)
==46337== by 0x1001240F9: Perl_pp_undef (in ./perl)
==46337== by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337== by 0x1001A4793: S_regmatch (in ./perl)
==46337== by 0x10019EADA: S_regtry (in ./perl)
==46337== by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337== by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337== by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337== by 0x10002B4BC: S_run_body (in ./perl)
==46337== by 0x10002ADEE: perl_run (in ./perl)
==46337== by 0x1000014D4: main (in ./perl)
==46337==
==46337== Invalid read of size 1
==46337== at 0x1001A17CB: S_regmatch (in ./perl)
==46337== by 0x10019EADA: S_regtry (in ./perl)
==46337== by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337== by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337== by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337== by 0x10002B4BC: S_run_body (in ./perl)
==46337== by 0x10002ADEE: perl_run (in ./perl)
==46337== by 0x1000014D4: main (in ./perl)
==46337== Address 0x1006019b3 is 3 bytes inside a block of size 10 free'd
==46337== at 0x100280C7C: free (vg_replace_malloc.c:366)
==46337== by 0x1000AE8B5: Perl_safesysfree (in ./perl)
==46337== by 0x1001240F9: Perl_pp_undef (in ./perl)
==46337== by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337== by 0x1001A4793: S_regmatch (in ./perl)
==46337== by 0x10019EADA: S_regtry (in ./perl)
==46337== by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337== by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337== by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337== by 0x10002B4BC: S_run_body (in ./perl)
==46337== by 0x10002ADEE: perl_run (in ./perl)
==46337== by 0x1000014D4: main (in ./perl)
That would be a bad thing :-(

It's not a terrible thing, given that:

    For reasons of security, this construct is forbidden if the regular
    expression involves run-time interpolation of variables, unless the
    perilous C<use re 'eval'> pragma has been used (see L<re>), or the
    variables contain results of the C<qr//> operator (see
    L<perlop/"qr/STRINGE<sol>msixpodual">).
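To illustrate the restriction quoted above, here is a minimal sketch. The variable names ($fragment, $refused, $error, $matched) are made up for this example, and the exact wording of the error may differ between perl versions. It interpolates a string containing a code block into a pattern, first without and then with the pragma:

```perl
use strict;
use warnings;

# Hypothetical run-time string that happens to contain a code block.
my $fragment = '(?{ 1 })';

# Without the pragma, the interpolated pattern is refused when it is
# compiled at run time, so the eval block dies.
my $refused = !eval { my $r = "abc" =~ /b$fragment/; 1 };
my $error   = $@;

# With the pragma in lexical scope, the same pattern is accepted.
my $matched;
{
    use re 'eval';
    $matched = "abc" =~ /b$fragment/ ? 1 : 0;
}

print "refused without pragma: ", ($refused ? "yes" : "no"), "\n";
print "error: $error" if $refused;
print "matched with pragma: $matched\n";
```

So the undefined behaviour is only reachable from code blocks written literally in the pattern (or in a qr// object), or by code that has explicitly opted in with the pragma.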
My vague understanding of the engine is that there are mechanisms in place to
copy the target string. Should these also be triggered if the pattern contains
any code blocks? [or anything else that could have side effects *during* the
match, if anything else exists]
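As a user-level illustration of the "copy the target string" idea, and not a description of what the engine does internally, the following sketch matches against a private copy. The code block then frees the original scalar while the engine is scanning a different one. The helper name match_on_copy is hypothetical, and $target is a package variable only so that the code block can see it on older perls as well:

```perl
use strict;
use warnings;

# Package variable, so the code block below refers to the same scalar
# regardless of how a given perl handles lexicals in code blocks.
our $target = "ydydydyd";

# Hypothetical helper: match against a private copy, so the scalar the
# engine scans is not the one the code block undefines.
sub match_on_copy {
    my $copy = $target;
    return $copy =~ /[^x]d(?{ undef $target })[^x]d/g;
}

my @matches = match_on_copy();

print "$_\n" for @matches;
print defined $target ? "target still defined\n" : "target is now undef\n";
```

With this input the match should return "ydyd" twice and leave $target undefined, without the engine touching the original buffer. Whether the engine should make such a copy itself whenever a pattern contains code blocks is the open question.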
Nicholas Clark
[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=low
---
Site configuration information for perl 5.15.0:
Configured by nick at Thu Jun 30 08:16:54 BST 2011.
Summary of my perl5 (revision 5 version 15 subversion 0) configuration:
Derived from: 5ef88e32837b528ef762bb5bdc3074489cf43a85
Platform:
osname=darwin, osvers=10.8.0, archname=darwin-2level
uname='darwin mouse-mill.local 10.8.0 darwin kernel version 10.8.0: tue jun 7 16:33:36 pdt 2011; root:xnu-1504.15.3~1release_i386 i386 '
config_args='-Dusedevel=y -Dcc=ccache clang -Dld=clang -Ubincompat5005 -Uinstallusrbinperl -Dcf_email=nick@ccl4.org -Dperladmin=nick@ccl4.org -Dinc_version_list= -Dinc_version_list_init=0 -Doptimize=-g -Uusethreads -Uuse64bitall -Uuselongdouble -Uusemymalloc -Duseperlio -Dprefix=~/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i -Dinstallman1dir=none -Dinstallman3dir=none -Dusevendorprefix -Dvendorprefix=~/Sandpit/vendor -Uuserelocatableinc -Ud_dosuid -Uuseshrplib -de -Accccflags=-DNO_PERL_PRESERVE_IVUV -Umad'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=undef, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='ccache clang', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/opt/local/include',
optimize='-g',
cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/opt/local/include'
ccversion='', gccversion='4.2.1 Compatible Clang Compiler', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib -L/opt/local/lib'
libpth=/usr/local/lib /opt/local/lib /usr/lib
libs=-lgdbm -ldbm -ldl -lm -lutil -lc
perllibs=-ldl -lm -lutil -lc
libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector'
Locally applied patches:
---
@INC for perl 5.15.0:
lib
/Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/site_perl/5.15.0/darwin-2level
/Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/site_perl/5.15.0
/Users/nick/Sandpit/vendor/lib/perl5/vendor_perl/5.15.0/darwin-2level
/Users/nick/Sandpit/vendor/lib/perl5/vendor_perl/5.15.0
/Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/5.15.0/darwin-2level
/Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/5.15.0
.
---
Environment for perl 5.15.0:
DYLD_LIBRARY_PATH (unset)
HOME=/Users/nick
LANG (unset)
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/opt/local/bin:/opt/local/sbin:/Users/nick/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/local/sbin:/sbin:/usr/sbin
PERL_BADLANG (unset)
SHELL=/bin/bash