
Re: Regular Expression Bug

From: Ed Peschko
Date: September 6, 1999 19:56
Subject: Re: Regular Expression Bug
Message ID: 37D47ED1.FEF26DC8@csgsystems.com

> Some other idea of Ed: C<*> could notice that it is
> inside (?>), and could do no backtracking at all.

Well, that's what I sort of assumed (?>) was when I saw it. Just to dwell
on it a bit, a 'no backtracking' operator would be very useful,
especially in conjunction with \G. I've been playing around with the
qr{ ... (?p{ ... }) ... } syntax to match things like #ifdef..#endif. It's
cool, but it is *very* easy to end up inside 'endless' regular
expressions. Take something like:

$qr = qr{
            {
                (?:
                    (?> [^{}]*) |
                    (?p{ $qr })
                )*
            }
        }x;

and the alternation of (?: (?> *) ...| (?p { .. }))* does not work very
well when you lose a trailing ')' off a 16K block: the engine keeps
backtracking through the block looking for some other way to match. With
a \G in place at the beginning of a regex - or a literal string plus
other occurrences of (?>) - you should be able to scan the string once,
see that it doesn't match the trailing ')' and go on. Example:

$trycatchre =
"(?>try\\s*$curlyblock\\s*catch\\s*$parenblock\\s*$curlyblock)";

or

$trycatchre =
"try(?>\\s*)$curlyblock(?>\\s*)catch(?>\\s*)$parenblock(?>\\s*)$curlyblock";

if ($code =~ m"\G($trycatchre)"sgc)
{
    print "Catch block! $1\n";
}

This would be very slow without \G, but has the potential of being very
fast as written, both when it fails and when it succeeds.
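
To try the idea out, here is a minimal, self-contained sketch. It is not
code from the post: the definitions of $curlyblock and $parenblock are
assumptions (the post uses them without showing them), find_trycatch is a
hypothetical helper name, and the recursive construct is written as
(??{ ... }), which is how later Perls spell the (?p{ ... }) used above.
Each atomic group swallows a run of ordinary characters and never gives
it back, and \G with /gc keeps every attempt pinned to the current
position.

```perl
use strict;
use warnings;

# Balanced {...} block (assumed definition, not from the post).
my $curlyblock;
$curlyblock = qr/
    \{
    (?:
        (?> [^{}]+ )             # run of non-brace text, never given back
      |
        (??{ $curlyblock })      # nested block, matched recursively
    )*
    \}
/x;

# Balanced (...) block, built the same way (also an assumption).
my $parenblock;
$parenblock = qr/
    \(
    (?:
        (?> [^()]+ )
      |
        (??{ $parenblock })
    )*
    \)
/x;

# Whole try ... catch construct wrapped in one atomic group.
my $trycatchre = qr/
    (?> try \s* $curlyblock \s* catch \s* $parenblock \s* $curlyblock )
/x;

# Hypothetical helper: walk the string once, trying the pattern at each
# position. \G anchors the attempt at pos(), and the c modifier keeps
# pos() where it was when an attempt fails.
sub find_trycatch {
    my ($code) = @_;
    my @found;
    pos($code) = 0;
    while (pos($code) < length $code) {
        if ($code =~ m/\G($trycatchre)/gc) {
            push @found, $1;
        }
        else {
            $code =~ m/\G./sgc;  # no match here, step over one character
        }
    }
    return @found;
}

my $src = 'x = 1; try { a { b } c } catch (E e) { log(e); } y = 2;';
print "Catch block! $_\n" for find_trycatch($src);
```

Stepping one character at a time on failure is the simplest choice for a
sketch; a real scanner would more likely skip whole tokens.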

Ed



