develooper Front page | perl.perl5.porters | Postings from February 2009

Re: regexp iteration limits

Thread Previous | Thread Next
From:
Zefram
Date:
February 22, 2009 15:11
Subject:
Re: regexp iteration limits
Message ID:
20090222231104.GE7057@fysh.org
demerphq wrote:
>                                      I guess we could bifurcate all
>the regops, but that gets messy.

I think only the iteration regops need this bifurcation.
In /(?>(?:...)*)/, the "*" can be told at parse time that it's in a
no-backtracking context, and it can compile to a regop that doesn't
make backtracking records and throws away the internal backtracking
records of the (?:...) at the end of each iteration.  Whatever's inside
the (?:...) doesn't need to be compiled any differently from usual.
Although, in fact, in this case one can impute a (?>...) around it,
and so pass the non-backtracking context inwards.

There are more complicated cases.  Generally, in /(?>(?:...){N,M})/, the
(?:...) needs to keep its backtracking records for the first N iterations.
After N iterations are complete, we know that we'll never backtrack
through there, so all those backtracking records can be discarded, and
each subsequent iteration's backtracking records can be discarded at
the end of the iteration.  /(?>(?:...)*\n)/ could conditionally discard
backtracking records at the end of an iteration based on whether the
next character is a newline.

>What id rather see happen is that perl does on the fly DFA
>construction when it makes sense to do so.

Interesting idea.  Presumably it would be applied where heuristics judge
that it's sensible; an important complication then is to ensure that it
provides exactly the same semantics as the NDFA.  It doesn't solve the
iteration limit problem, on its own.

-zefram

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About