develooper Front page | perl.perl5.porters | Postings from March 2016

Re: [perl #126182] /(.(?2))((?<=(?=(?1)).))/ hangs and eats allavailable RAM

Thread Previous | Thread Next
From:
demerphq
Date:
March 8, 2016 17:40
Subject:
Re: [perl #126182] /(.(?2))((?<=(?=(?1)).))/ hangs and eats allavailable RAM
Message ID:
CANgJU+XOQW9fYJzqb32ypBq=9epDBoSzEy3B3yOFHzQcAdKBnA@mail.gmail.com
On 8 March 2016 at 18:20, Zefram <zefram@fysh.org> wrote:
> demerphq wrote:
>>Anyway, much of the dialog that you posted here IMO is unrelated to
>>regex infinite recursion, it relates to how some parts of the regex
>>handle matching the empty string.
>
> It's almost precisely equivalent.  The problems you're addressing in (?R)
> type recursion are the same ones that occur with * iteration, because the
> iteration really operates in a recursive fashion.  In general, /(x*)/
> is equivalent to /((?:x(?1))?)/, and for the purposes of which ways it
> can match those are equivalent to /((?:(?1)x)?)/.  The (documented)
> behaviour that captures by the recursively-called subpattern are not
> available to the calling pattern breaks some of my constructions, but
> generally the same problems apply.
>
>>Can you show a pattern where regex recursion does not do the right
>>thing?
>
> Just using left recursion, not emulating a zero-length iteration, consider
>
>         "abbc" =~ /a((?:(?1)b)?)c/
>
> Mathematically this does match, in exactly one way.  But the naive
> execution of the pattern would infinitely recurse, attempting to try out
> infinitely many paths that would not match before it'll consider the
> way that actually does.  So acceptable results are to yield a match,
> hang, exhaust memory, or signal an exception to forestall hanging.
> 5.22 signals "Infinite recursion in regex", which is fine.  blead says
> it does not match, which is the wrong answer.

Ok, then I will restore the "Infinite recursion in regex" so we do not
do the wrong thing.

>
> The left recursion can be rescued by changing the ? to ??, so preferring
> less recursion.  Of course this changes the equivalent iterative construct
> from * to *?.  With that change, blead correctly shows a match, but 5.22
> still signals "Infinite recursion".

Hmm.

>
> Looking at some zero-length-iteration cases, I found some curious
> behaviour on
>
>         "abc" =~ /a((?1)??)c/
>
> 5.22 signals "Infinite recursion" for this, not surprisingly, but
> blead hangs, slowly eating memory.  Hanging is acceptable here, as I've
> said, but this seems to be a regression from the point of view of your
> objective of avoiding hanging.  I'd expect your fail-on-recursion logic
> to (correctly) say that this does not match, in finite time.

Yes, me too. I am looking into why it doesn't.

>  There are
> closely related cases, such as the same thing with ? in place of ??,
> for which blead produces the right answer and 5.22 signals "Infinite
> recursion".

Could you help me with this and post some test you think should pass?

I am digging into the example you gave to figure out why it doesnt get
caught by the new algorithm.

>
>>The only problem are things like /(?<foo>(?&foo))/.
>
> With unconditional recursion, that has no way to match using a finite
> amount of structure, and mathematically it's tautological, so I'm fine
> about it being shortcutted.  I'm concerned about things that in principle
> can actually match.

I appreciate that concern and I share it too.

cheers,
Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About