develooper Front page | perl.perl5.porters | Postings from June 2011

[perl #93824] regex code blocks manipulating regex target can cause undefined behaviour

From: Nicholas Clark
Date: June 30, 2011 04:32
Subject: [perl #93824] regex code blocks manipulating regex target can cause undefined behaviour
Message ID: rt-3.6.HEAD-16080-1309433513-378.93824-75-0@perl.org
# New Ticket Created by  Nicholas Clark 
# Please include the string:  [perl #93824]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=93824 >



This is a bug report for perl from nick@ccl4.org,
generated with the help of perlbug 1.39 running under perl 5.15.0.


-----------------------------------------------------------------
[Please describe your issue here]

The regex engine assumes, in at least some cases, that the scalar it is
matching against cannot change during the match.

If a (?{}) code block inside a regex undefines the target scalar, the engine
goes on to read the freed string buffer:

$ valgrind ./perl -Ilib -le '$a = "ydydydyd"; warn $_ foreach $a =~ /[^x]d(?{undef $a})[^x]d/g'
==46337== Memcheck, a memory error detector
==46337== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==46337== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==46337== Command: ./perl -Ilib -le $a\ =\ "ydydydyd";\ warn\ $_\ foreach\ $a\ =~\ /[^x]d(?{undef\ $a})[^x]d/g
==46337== 
--46337-- ./perl:
--46337-- dSYM directory is missing; consider using --dsymutil=yes
==46337== Invalid read of size 1
==46337==    at 0x1001AB360: S_reginclass (in ./perl)
==46337==    by 0x1001A174F: S_regmatch (in ./perl)
==46337==    by 0x10019EADA: S_regtry (in ./perl)
==46337==    by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337==    by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337==    by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337==    by 0x10002B4BC: S_run_body (in ./perl)
==46337==    by 0x10002ADEE: perl_run (in ./perl)
==46337==    by 0x1000014D4: main (in ./perl)
==46337==  Address 0x1006019b2 is 2 bytes inside a block of size 10 free'd
==46337==    at 0x100280C7C: free (vg_replace_malloc.c:366)
==46337==    by 0x1000AE8B5: Perl_safesysfree (in ./perl)
==46337==    by 0x1001240F9: Perl_pp_undef (in ./perl)
==46337==    by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337==    by 0x1001A4793: S_regmatch (in ./perl)
==46337==    by 0x10019EADA: S_regtry (in ./perl)
==46337==    by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337==    by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337==    by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337==    by 0x10002B4BC: S_run_body (in ./perl)
==46337==    by 0x10002ADEE: perl_run (in ./perl)
==46337==    by 0x1000014D4: main (in ./perl)
==46337== 
==46337== Invalid read of size 1
==46337==    at 0x1001A17CB: S_regmatch (in ./perl)
==46337==    by 0x10019EADA: S_regtry (in ./perl)
==46337==    by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337==    by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337==    by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337==    by 0x10002B4BC: S_run_body (in ./perl)
==46337==    by 0x10002ADEE: perl_run (in ./perl)
==46337==    by 0x1000014D4: main (in ./perl)
==46337==  Address 0x1006019b3 is 3 bytes inside a block of size 10 free'd
==46337==    at 0x100280C7C: free (vg_replace_malloc.c:366)
==46337==    by 0x1000AE8B5: Perl_safesysfree (in ./perl)
==46337==    by 0x1001240F9: Perl_pp_undef (in ./perl)
==46337==    by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337==    by 0x1001A4793: S_regmatch (in ./perl)
==46337==    by 0x10019EADA: S_regtry (in ./perl)
==46337==    by 0x1001935AA: Perl_regexec_flags (in ./perl)
==46337==    by 0x1000EE85E: Perl_pp_match (in ./perl)
==46337==    by 0x1000E5EE7: Perl_runops_standard (in ./perl)
==46337==    by 0x10002B4BC: S_run_body (in ./perl)
==46337==    by 0x10002ADEE: perl_run (in ./perl)
==46337==    by 0x1000014D4: main (in ./perl)


That would be a bad thing :-(

It's not a terrible thing, given what perlre says about (?{ }) code blocks:

    For reasons of security, this construct is forbidden if the regular
    expression involves run-time interpolation of variables, unless the
    perilous C<use re 'eval'> pragma has been used (see L<re>), or the
    variables contain results of the C<qr//> operator (see
    L<perlop/"qr/STRINGE<sol>msixpodual">).

My vague understanding of the engine is that there are mechanisms in place to
copy the target string. Should these also be triggered if the pattern contains
any code blocks? [or anything else that could have side effects *during* the
match, if anything else exists]
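
Until the engine does something like that itself, the same idea can be
approximated from the calling side. The following is only a minimal sketch of
a user-level workaround, not the engine change asked about above: it matches
against a private copy of the string, so the scalar the code block undefines
is no longer the one the engine is scanning. The names $target and $copy are
illustrative and do not appear in the original one-liner.

```perl
use strict;
use warnings;

# Package variable, so the code block refers to it reliably on older perls.
our $target = "ydydydyd";

# Assumption: copying the string before the match is acceptable for the
# caller. The regex engine scans $copy, so undefining $target inside the
# code block does not touch the scalar being matched.
my $copy = $target;

# List context with /g and no capture groups returns every matched substring.
my @matches = $copy =~ /[^x]d(?{ undef $target })[^x]d/g;

print "$_\n" for @matches;
```

The pattern is a literal with no run-time interpolation, so the code block
does not need use re 'eval'. The cost is one extra copy of the target per
match, which is the same trade-off the question above raises for the engine.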

Nicholas Clark

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=low
---
Site configuration information for perl 5.15.0:

Configured by nick at Thu Jun 30 08:16:54 BST 2011.

Summary of my perl5 (revision 5 version 15 subversion 0) configuration:
  Derived from: 5ef88e32837b528ef762bb5bdc3074489cf43a85
  Platform:
    osname=darwin, osvers=10.8.0, archname=darwin-2level
    uname='darwin mouse-mill.local 10.8.0 darwin kernel version 10.8.0: tue jun 7 16:33:36 pdt 2011; root:xnu-1504.15.3~1release_i386 i386 '
    config_args='-Dusedevel=y -Dcc=ccache clang -Dld=clang -Ubincompat5005 -Uinstallusrbinperl -Dcf_email=nick@ccl4.org -Dperladmin=nick@ccl4.org -Dinc_version_list=  -Dinc_version_list_init=0 -Doptimize=-g -Uusethreads -Uuse64bitall -Uuselongdouble -Uusemymalloc -Duseperlio -Dprefix=~/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i -Dinstallman1dir=none -Dinstallman3dir=none -Dusevendorprefix -Dvendorprefix=~/Sandpit/vendor -Uuserelocatableinc -Ud_dosuid -Uuseshrplib -de -Accccflags=-DNO_PERL_PRESERVE_IVUV -Umad'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='ccache clang', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/opt/local/include',
    optimize='-g',
    cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/opt/local/include'
    ccversion='', gccversion='4.2.1 Compatible Clang Compiler', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /opt/local/lib /usr/lib
    libs=-lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector'

Locally applied patches:
    

---
@INC for perl 5.15.0:
    lib
    /Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/site_perl/5.15.0/darwin-2level
    /Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/site_perl/5.15.0
    /Users/nick/Sandpit/vendor/lib/perl5/vendor_perl/5.15.0/darwin-2level
    /Users/nick/Sandpit/vendor/lib/perl5/vendor_perl/5.15.0
    /Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/5.15.0/darwin-2level
    /Users/nick/Sandpit/snap5.9.x-v5.15.0-135-g5ef88e3-i/lib/perl5/5.15.0
    .

---
Environment for perl 5.15.0:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/nick
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/opt/local/bin:/opt/local/sbin:/Users/nick/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/local/sbin:/sbin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

