
Re: [perl #92898] (*THEN) broken inside condition subpattern

From: Philip Hazel
Date: September 12, 2011 06:23
Subject: Re: [perl #92898] (*THEN) broken inside condition subpattern
Message ID: alpine.LNX.2.00.1109121133170.4752@quercite.quercite.com

On Sun, 11 Sep 2011, Father Chrysostomos via RT wrote:

> On Wed Jun 15 11:26:50 2011, ph10@hermes.cam.ac.uk wrote:
> > It seems to me that, if what precedes (*THEN) in a branch matches
> > only a fixed string (no backtracking points), then the behaviour
> > should be exactly the same as if (*THEN) is not present. Here is
> > an example where that is not so:
> > 
> > Pattern:   /^.*?(?(?=a)a|b(*THEN)c)/
> > Subject:   ba
> > Result:    no match
> > 
> > Pattern:   /^.*?(?(?=a)a|bc)/
> > Subject:   ba
> > Result:    matches "ba"
> > 
> > I noticed this because I have just fixed the same bug in PCRE.
> 
> Do the pipes in the (?(...)...) condition expression count as regular
> alternations? It seems they don’t. Should they?

The documentation for THEN says that it tries "the next alternation in
the innermost enclosing group". And it also says "Note that if this
operator is used and NOT inside of an alternation then it acts exactly
like the "(*PRUNE)" operator."

So yes, I guess it all depends on whether or not the branches of a
conditional subpattern count as regular alternations. My feeling is that 
they should, so that all branches behave in the same way in regard to 
(*THEN). That is, if (*THEN) is backtracked onto, it skips any previous 
backtracking points in the current branch, and then fails the branch. If 
there are other branches (in non-conditional groups), or other
backtracking points prior to the current group, they will be activated.
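
For anyone who wants to check this locally, here is a minimal sketch that
runs the two patterns from the original report against the subject "ba".
The helper name try_match is only for illustration. The result of the
(*THEN) case is expected to depend on the Perl version, so no output is
shown.

```perl
use strict;
use warnings;

# Hypothetical helper: return the matched text, or undef on no match.
sub try_match {
    my ($re, $subject) = @_;
    if ($subject =~ $re) {
        # @- and @+ hold the start and end offsets of the whole match.
        return substr($subject, $-[0], $+[0] - $-[0]);
    }
    return undef;
}

for my $case (
    [ 'with (*THEN)',    qr/^.*?(?(?=a)a|b(*THEN)c)/ ],
    [ 'without (*THEN)', qr/^.*?(?(?=a)a|bc)/        ],
) {
    my ($label, $re) = @$case;
    my $got = try_match($re, 'ba');
    printf "%-16s %s\n", $label,
        defined $got ? qq{matches "$got"} : 'no match';
}
```

If the branches of a conditional subpattern are treated like ordinary
alternation branches, both lines should report the same result.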

While thinking about this and experimenting, I've just discovered
another oddity of (*THEN).

Pattern: /a+?(*THEN)c/
Subject: aaac
Result:  Perl 5.012003 matches "aaac" 

However, PCRE matches only "ac". The same thing happens with (*PRUNE).
When PCRE backtracks onto (*THEN), it fails the match at the current
starting position and moves on to the next position in the subject. Perl
seems instead to be backtracking into the a+? item, which does not appear
to be in accordance with the documentation for (*PRUNE).
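
A similar sketch for comparing the verbs on this second example, again
using a hypothetical try_match helper. The plain a+?c pattern is included
only as a control. Which of "aaac" or "ac" the verb cases print is exactly
the point in question, so no expected output is given.

```perl
use strict;
use warnings;

# Hypothetical helper: return the matched text, or undef on no match.
sub try_match {
    my ($re, $subject) = @_;
    if ($subject =~ $re) {
        return substr($subject, $-[0], $+[0] - $-[0]);
    }
    return undef;
}

printf "Perl version: %s\n", $];

for my $case (
    [ 'a+?(*THEN)c',  qr/a+?(*THEN)c/  ],
    [ 'a+?(*PRUNE)c', qr/a+?(*PRUNE)c/ ],
    [ 'a+?c',         qr/a+?c/         ],
) {
    my ($label, $re) = @$case;
    my $got = try_match($re, 'aaac');
    printf "%-14s %s\n", $label,
        defined $got ? qq{matches "$got"} : 'no match';
}
```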

Regards,
Philip

-- 
Philip Hazel