On Tue, Jul 17, 2018 at 12:58 PM David Nicol <davidnicol@gmail.com> wrote:

> This is a good test case for the bug:
>
> > On the other hand, my patch does allow this example to match:
> >
> > print "matched\n" if "ABCDA" =~ /^ (?: (.)B | CD )* \1 $/x;
> >
> > Without my patch, this matches instead:
> >
> > print "matched\n" if "ABCDC" =~ /^ (?: (.)B | CD )* \1 $/x;
>
> The bug appears to be the result of an optimization of describing the
> capture buffer with an offset -- essentially a dynamic substring
> expression -- rather than copying the captured string into it.
> Were the capture buffer to be copied into, it would get 'A' (the
> character before the B) rather than 'C' (the first character in the
> match) and the behavior would be the same as what the other regex
> engines do.
>
> Is that what the patch changes?

That optimization doesn't cause the bug; it's the attempt to match the
(.) again against "CD" that causes it -- the (.) matches, but the "D"
doesn't match the B, and the original capture isn't restored.

Deven
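
For anyone who wants to see the failure mode in isolation, here is a toy sketch. It is not the regex engine's code or the patch; it is a hand-rolled backtracking matcher for this one pattern only. The name match_from and the $restore flag are made up for illustration. The way the stale capture leaks only from the failed (.)B branch into the CD branch is an assumption, chosen because it reproduces the two results reported in this thread.

```perl
use strict;
use warnings;

# Toy matcher for /^ (?: (.)B | CD )* \1 $/x and nothing else.
# Hypothetical helper, not part of perl: $restore = 1 puts the old capture
# back when the (.)B branch fails; $restore = 0 leaves the stale one behind.
# Assumes the input has no newlines, so "." can match any character.
sub match_from {
    my ($str, $pos, $cap, $restore) = @_;
    my $len = length $str;

    # Try one more iteration of the group first, because * is greedy.
    my $branch_cap = $cap;
    if ($pos < $len) {
        # Branch 1: (.) captures one character, then a literal B must follow.
        my $tried = substr($str, $pos, 1);
        if (substr($str, $pos + 1, 1) eq 'B') {
            return 1 if match_from($str, $pos + 2, $tried, $restore);
        }
        # Branch 1 failed: restore the capture, or keep the stale value.
        $branch_cap = $restore ? $cap : $tried;
    }

    # Branch 2: literal CD. Whatever capture is current carries forward.
    if (substr($str, $pos, 2) eq 'CD') {
        return 1 if match_from($str, $pos + 2, $branch_cap, $restore);
    }

    # No more iterations: \1 must match the rest of the string exactly.
    # A backreference to an unset group fails.
    return defined($cap) && substr($str, $pos) eq $cap;
}

for my $restore (1, 0) {
    for my $s ('ABCDA', 'ABCDC') {
        printf "restore=%d %s: %s\n", $restore, $s,
            match_from($s, 0, undef, $restore) ? 'match' : 'no match';
    }
}
```

With $restore set, only "ABCDA" matches in this model; with it cleared, only "ABCDC" does, because the "C" captured during the failed (.)B attempt is still in place when CD succeeds. To find out which behavior a particular perl build has, run the two one-liners quoted above against it.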