develooper Front page | perl.perl5.porters | Postings from September 1999

Re: Regular Expression Bug

Thread Next
From:
horos
Date:
September 7, 1999 19:36
Subject:
Re: Regular Expression Bug
Message ID:
199909080234.TAA00326@earth.he.net
> $brackets = qr{
>             {
>                 (?:
>                     (?> [^{}]*) |
>                     (?p{ $qr })
>                 )*
>             }
>         }x;
> 
> and the alternation of (?: (?> *) ...| (?p { .. }))* does not work very
> well when you lose a trailing ')' off of a 16K block. With a \G in place
> at the beginning of a regex - or a literal string plus other occurrences
> of (?>), you should be able to scan the string once, see that it doesn't
> match the trailing ')' and go on.

> I do not follow this \G stuff.  Where do you want to put it?

Well, take the above regular expression. If there was a 'no backtracking' 
operator, you could match the following relatively quickly:

"{this{ fails { very } slowly } without no backtrack op" =~ m"\G$brackets"; 

because you would have to only go through the rest of the string once to find
your bug. With the ?: above, it gets caught in the (.*)* problem (I think ).

Come to think of it, perhaps 'all or nothing' is a better term for the operator
- ie, if something fails to match, the engine rewinds all the way back
to the beginning, bumps up its position by 1 from the beginning point
and then starts again.


So matching the above without the \G modifier would look like:

tries '{.... (goes to end without success)
tries 't' (fails immediately)
tries 'h' (fails immediately)
tries 'i' (fails immediately)
tries 's' (fails immediately)
tries '{........ slowly } (succeeds.)

A complete failure here would 'quick' too - with a bound of O(n^2) for something
like:

'{{{{{{{{{{{{'

> 
> $trycatchre =
> "(?>try\\s*$curlyblock\\s*catch\\s*$parenblock\\s*$curlyblock)";
>
> Please do not use qq"" to deal with RExen.

Good point. I'm so used to the metaphor, but qr { } is much cleaner:

$trycatchre = qw { 
                   (?> try\s*
                           $curlyblock\s*
                       catch\s* $parenblock \s*
                           $curlyblock
                   )
               }x;
            
Ed

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About