Re: [perl #3634] Capture corruption through self-modying regexp(?{...})

From: Nicholas Clark
Date: July 29, 2013 14:30
Subject: Re: [perl #3634] Capture corruption through self-modying regexp(?{...})
Message ID: 20130729143023.GX3729@plum.flirble.org

On Sun, Jul 28, 2013 at 01:21:23AM +0100, Dave Mitchell wrote:
> On Sat, Jul 27, 2013 at 07:05:39AM -0700, Father Chrysostomos via RT wrote:
> > On Thu Jun 14 15:13:18 2012, davem wrote:
> > > On Thu, Jun 14, 2012 at 09:48:34AM -0700, Father Chrysostomos via RT
> > wrote:
> > > > On Thu Aug 03 18:02:21 2000, jfriedl@yahoo-inc.com wrote:
> > > > > 
> > > > >     #!/usr/local/bin/perl -w
> > > > >     use strict;
> > > > > 
> > > > >     my $text = "a";
> > > > >     $text =~ m/(.(?{ $text .= "x" }))*/;
> > > > > 
> > > > >     print "text is [$text]\n";
> > > > >     print "length of text: ", length($text), "\n";
> > > > >     print "starts: ", join('|', @-), "\n";
> > > > >     print "ends  : ", join('|', @-), "\n";
> > > > >     printf("length of match parts: [%d|%d|%d]\n", length($`),
> > > > > length($&), length($'));
> > > > >     printf("match itself: [%s|%s|%s]\n", map { defined($_) ? $_ : 'X'}
> > > > > $`, $&, $');
> > > > >     print "\$1[$1]\n";
> > > > > 
> > > > > prints (when piped through cat -v):
> > > > > 
> > > > >     text is [axxxxxxxxx]
> > > > >     length of text: 10
> > > > >     starts: 0|7
> > > > >     ends  : 0|7
> > > > >     length of match parts: [0|8|0]
> > > > >     match itself: [|a^@^X@M-hd^O^H|X]
> > > > >     $1[^H]
> > > > 
> > > > This is still a problem in bleadperl (c8d84f8c67a), even after Dave
> > > > Mitchell's jumbo re-eval rewrite.
> > > 
> > > Yep, that's the one ticket in the metaticket that's not fixed yet.
> > 
> > This appears to be fixed now, and I suspect it is because of
> > PERL_NEW_COPY_ON_WRITE (meaning the bug is still present under
> > -Accflags=-DPERL_NO_COW), but I haven't checked.
> 
> The assertion failures stop with the following commit, according to
> bisect, although I haven't looked closely to decide whether this
> is actually the complete fix or whether anything still needs addressing.
> 
> commit 7016d6ebb4afd4eb7b71b00f15b7515b5e45fee8
> Author: David Mitchell <davem@iabyn.com>
> Date:   Fri Sep 21 10:29:04 2012 +0100
> 
>     stop regex engine reading beyond end of string

For the given test case, the errors also stop at that commit.
(It wasn't clear to me whether you had confirmed this, as you only mentioned
the assertions.)

I ran this:

Porting/bisect.pl --expect-fail --target=miniperl -we '$_ = "a"; /(.(?{ $_ .= "x" }))*/; $_ = $&; die $_ if /[^ax]/'

That commit predates the merge of COW, so COW can't be what fixed it, but I
did also try this, which got the same answer:

Porting/bisect.pl -Accflags=-DPERL_NO_COW --expect-fail --target=miniperl -we '$_ = "a"; /(.(?{ $_ .= "x" }))*/; $_ = $&; die $_ if /[^ax]/'


[From the commit message]

>     To track these down, I temporarily hacked regexec_flags() to make a copy
>     of the string but without trailing \0, then ran all the t/re/*.t tests
>     under valgrind to flush out all buffer overruns. So I think I've removed
>     most of the bad code, but by no means all of it. The code within the
>     various functions in regexec.c is far too complex to be able to visually
>     audit the code with any confidence.

I guess it would also be possible to hack this to SEGV without valgrind by
using mmap to map an anonymous block, copying the string (without 
trailing \0) to abut the end, and then matching on that.
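
A minimal sketch of that mmap idea, in C. The names guarded_copy,
guarded_copy_make and guarded_copy_free are hypothetical and not part of
perl. One assumption goes beyond the description above: the last page of
the mapping is explicitly marked PROT_NONE, so that a read past the end of
the string faults deterministically instead of depending on whatever
happens to be mapped next.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type (not perl API): a copy of a string placed so
 * that its last byte is immediately followed by an inaccessible page. */
typedef struct {
    void  *base;     /* start of the whole mapping                    */
    size_t map_len;  /* total bytes mapped, guard page included       */
    char  *str;      /* the copy; str + len is the guard page's start */
} guarded_copy;

/* Copy len bytes of src (no trailing \0) to abut the guard page.
 * Returns 0 on success, -1 on failure. */
static int guarded_copy_make(guarded_copy *g, const char *src, size_t len)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;

    size_t page = (size_t)pagesize;
    size_t data_pages = (len + page - 1) / page;
    if (data_pages == 0)
        data_pages = 1;            /* keep one data page for empty strings */
    size_t data_len = data_pages * page;

    g->map_len = data_len + page;  /* data pages plus one guard page */
    g->base = mmap(NULL, g->map_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (g->base == MAP_FAILED)
        return -1;

    /* Assumption beyond the original suggestion: make the final page
     * inaccessible so any overrun raises SIGSEGV. */
    if (mprotect((char *)g->base + data_len, page, PROT_NONE) != 0) {
        munmap(g->base, g->map_len);
        return -1;
    }

    g->str = (char *)g->base + data_len - len;
    memcpy(g->str, src, len);
    return 0;
}

/* Release the mapping created by guarded_copy_make(). */
static void guarded_copy_free(guarded_copy *g)
{
    munmap(g->base, g->map_len);
}
```

Code under test would then be handed g.str and the length; reading even
one byte at g.str[len] should SEGV rather than silently succeed.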

But I guess that this won't reveal any more than the failures you have
already fixed, as the limiting factor here is the test suite, not the
analysis tools.

Would a fuzzer help?

Nicholas Clark
