Front page | perl.perl5.porters |
Postings from March 2016
Re: [perl #126182] /(.(?2))((?<=(?=(?1)).))/ hangs and eats allavailable RAM
Thread Previous
|
Thread Next
From:
demerphq
Date:
March 8, 2016 17:40
Subject:
Re: [perl #126182] /(.(?2))((?<=(?=(?1)).))/ hangs and eats allavailable RAM
Message ID:
CANgJU+XOQW9fYJzqb32ypBq=9epDBoSzEy3B3yOFHzQcAdKBnA@mail.gmail.com
On 8 March 2016 at 18:20, Zefram <zefram@fysh.org> wrote:
> demerphq wrote:
>>Anyway, much of the dialog that you posted here IMO is unrelated to
>>regex infinite recursion, it relates to how some parts of the regex
>>handle matching the empty string.
>
> It's almost precisely equivalent. The problems you're addressing in (?R)
> type recursion are the same ones that occur with * iteration, because the
> iteration really operates in a recursive fashion. In general, /(x*)/
> is equivalent to /((?:x(?1))?)/, and for the purposes of which ways it
> can match those are equivalent to /((?:(?1)x)?)/. The (documented)
> behaviour that captures by the recursively-called subpattern are not
> available to the calling pattern breaks some of my constructions, but
> generally the same problems apply.
>
>>Can you show a pattern where regex recursion does not do the right
>>thing?
>
> Just using left recursion, not emulating a zero-length iteration, consider
>
> "abbc" =~ /a((?:(?1)b)?)c/
>
> Mathematically this does match, in exactly one way. But the naive
> execution of the pattern would infinitely recurse, attempting to try out
> infinitely many paths that would not match before it'll consider the
> way that actually does. So acceptable results are to yield a match,
> hang, exhaust memory, or signal an exception to forestall hanging.
> 5.22 signals "Infinite recursion in regex", which is fine. blead says
> it does not match, which is the wrong answer.
Ok, then I will restore the "Infinite recursion in regex" so we do not
do the wrong thing.
>
> The left recursion can be rescued by changing the ? to ??, so preferring
> less recursion. Of course this changes the equivalent iterative construct
> from * to *?. With that change, blead correctly shows a match, but 5.22
> still signals "Infinite recursion".
Hmm.
>
> Looking at some zero-length-iteration cases, I found some curious
> behaviour on
>
> "abc" =~ /a((?1)??)c/
>
> 5.22 signals "Infinite recursion" for this, not surprisingly, but
> blead hangs, slowly eating memory. Hanging is acceptable here, as I've
> said, but this seems to be a regression from the point of view of your
> objective of avoiding hanging. I'd expect your fail-on-recursion logic
> to (correctly) say that this does not match, in finite time.
Yes, me too. I am looking into why it doesn't.
> There are
> closely related cases, such as the same thing with ? in place of ??,
> for which blead produces the right answer and 5.22 signals "Infinite
> recursion".
Could you help me with this and post some test you think should pass?
I am digging into the example you gave to figure out why it doesnt get
caught by the new algorithm.
>
>>The only problem are things like /(?<foo>(?&foo))/.
>
> With unconditional recursion, that has no way to match using a finite
> amount of structure, and mathematically it's tautological, so I'm fine
> about it being shortcutted. I'm concerned about things that in principle
> can actually match.
I appreciate that concern and I share it too.
cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"
Thread Previous
|
Thread Next