perl.perl5.porters | Postings from July 2018

Re: [perl #133352] Ancient Regex Regression

From: Deven T. Corzine
Date: July 17, 2018 17:55
Subject: Re: [perl #133352] Ancient Regex Regression
Message ID: CAFVdu0SjXindCndDc6wYC=_a3qPt78OigsO8-GUsJOg3XSrXpQ@mail.gmail.com
On Tue, Jul 17, 2018 at 12:58 PM David Nicol <davidnicol@gmail.com> wrote:

>
> This is a good test case for the bug:
>
> On the other hand, my patch does allow this example to match:
>
>      print "matched\n" if "ABCDA" =~ /^ (?: (.)B | CD )* \1 $/x;
>
> Without my patch, this matches instead:
>
>      print "matched\n" if "ABCDC" =~ /^ (?: (.)B | CD )* \1 $/x;
>
>
>
> The bug appears to be the result of an optimization of describing the
> capture buffer with an offset -- essentially a dynamic substring expression
> -- rather than copying the captured string into it.
> Were the capture buffer to be copied into, it would get 'A' (the
> character before the B) rather than 'C' (the first character in the match)
> and the behavior would be the same as what the other regex engines do.
>
> Is that what the patch changes?
>

That optimization doesn't cause the bug. The cause is the attempt to match
the (.) again against "CD": the (.) matches the "C", but the "B" that
follows it in the pattern fails against the "D", and the engine doesn't
restore the original capture when it backs out of that alternative.
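
For anyone who wants to check their own perl, here is a minimal sketch that
runs the two examples quoted above and prints what ended up in the capture.
The helper name capture_on_match is just something I made up for
illustration. It isn't part of the patch or of perl itself. Going by the
quoted description, a perl without the patch should report a match for
"ABCDC" and a patched one should report a match for "ABCDA", but I'm
deliberately not hard-coding either outcome here.

```perl
use strict;
use warnings;

# Pattern taken verbatim from the quoted examples.
my $re = qr/^ (?: (.)B | CD )* \1 $/x;

# Hypothetical helper: returns the text of capture group 1 if the
# input matches, or undef if it does not.
sub capture_on_match {
    my ($input) = @_;
    if ($input =~ $re) {
        return $1;
    }
    return undef;
}

for my $input ("ABCDA", "ABCDC") {
    my $captured = capture_on_match($input);
    if (defined $captured) {
        print "$input: matched, \\1 = '$captured'\n";
    } else {
        print "$input: no match\n";
    }
}
```

Which of the two lines says "matched" tells you whether the stale capture
from the failed (.)B attempt is still being used for the \1.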

Deven
