perl.perl5.porters | Postings from September 2011

Re: [perl #92898] (*THEN) broken inside condition subpattern

Philip Hazel
September 21, 2011 12:04
On Wed, 21 Sep 2011, Father Chrysostomos via RT wrote:

> >    A (B (*THEN) C)
> >    A (B (*THEN) C | D)
> > 
> > In the first, Perl fails the match without any backtracking if C
> > fails; in the second, it backtracks into A. Or that is what my
> > experiments imply.
> Oddly, I don't find that counterintuitive at all.  Do we need three
> versions of prune/then?

Aha! One person's intuition is always another's total craziness. :-)
I wonder what the percentages each way would be if we surveyed the
general Perl-using population? At least one other person thinks as I 
do, because it was a bug report for PCRE - which was behaving more like 
Perl - that got me into this issue in the first place. 

Is there a forum where we could ask the following question?

   Folks, consider the pattern ^A(B(*THEN)C), where A, B, and C are 
   complex patterns. If matching fails in C, do you expect that
   (a) the entire match should fail, or
   (b) the matching should backtrack into A?  
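
As a concrete instance of the question, here is a minimal probe. It 
assumes trivial stand-ins for the "complex patterns" (A is a+?, B is a, 
C is b; these are hypothetical choices, not taken from the bug report) 
and simply reports which of the two behaviours the installed engine 
shows. No particular answer is assumed.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the pattern ^A(B(*THEN)C):
#   A = a+?   (lazy, so it first takes a single 'a')
#   B = a
#   C = b
# On 'aaab' the first try has A='a', B='a', and then C fails against 'a'.
# The engine therefore backtracks onto (*THEN) inside a group with no '|'.
my $subject = 'aaab';
my $matched = $subject =~ /^a+?(a(*THEN)b)/ ? 1 : 0;

print $matched
    ? "(b) backtracked into A and matched '$&'\n"
    : "(a) the entire match failed\n";
```

The ^ anchor matters here: without it, a failure at the first position 
would still let the engine advance and match later in the subject, which 
would hide the difference between (a) and (b).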

I will ask this question on the pcre-dev mailing list and see what 
answers (if any) I get. I might try Jeffrey Friedl as well. I will not 
be in the least offended if I am "outvoted".

It seems that you intuitively think that a group without a | is a
different kind of animal to a group that contains a |, whereas I don't.
I just think that a group without a | has "one alternative", maybe
better expressed as "one branch" (since "alternative" implies at least
one of two).

I can, however, understand your logic; I have to say that to me it seems
rather mathematically pedantic (with respect :-).

I'd rather not create yet another version of prune/then! I *thought* I 
understood these verbs. It seemed to me that they provide different 
"strengths" of pruning when backtracked onto, as follows (from weakest 
to strongest):

*THEN fails the current alternation branch, and restarts at the next
alternative in the current group, or fails the whole group if there are
no more alternatives.

*PRUNE fails the current match, but allows an advance to the next 
starting position (unless anchored).

*SKIP is like *PRUNE, but can skip forward more than one character.

*COMMIT fails the entire matching process, not allowing any further 
advance in the subject.
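
The three stronger verbs can be compared with a small Perl sketch in the 
style of the examples in perlre. The subject string and the pattern 
a+b? are illustrative assumptions; each pattern ends in (*FAIL), so no 
match ever succeeds, and an embedded code block records what had been 
matched each time the engine gets past the verb. *THEN is left out 
because its behaviour depends on what counts as "the current group".

```perl
use strict;
use warnings;

# Package variables so the (?{ ... }) code blocks can see them reliably.
our (@plain, @prune, @skip, @commit);

my $subject = 'aaabaaab';

# No verb: every start position and every backtracking state is tried.
$subject =~ /a+b?(?{ push @plain, $& })(*FAIL)/;

# (*PRUNE): backtracking past the verb fails the attempt at this start
# position, but the engine may still advance by one character.
$subject =~ /a+b?(*PRUNE)(?{ push @prune, $& })(*FAIL)/;

# (*SKIP): like (*PRUNE), but the next start position is where the
# verb was reached, so several characters can be skipped at once.
$subject =~ /a+b?(*SKIP)(?{ push @skip, $& })(*FAIL)/;

# (*COMMIT): backtracking past the verb abandons the whole match,
# with no further advance in the subject.
$subject =~ /a+b?(*COMMIT)(?{ push @commit, $& })(*FAIL)/;

printf "%-7s %2d: %s\n", $_->[0], scalar @{ $_->[1] }, "@{ $_->[1] }"
    for [none => \@plain], [PRUNE => \@prune],
        [SKIP => \@skip],  [COMMIT => \@commit];
```

Working through the backtracking by hand, the counts should come out as 
18 with no verb, 6 with *PRUNE, 2 with *SKIP and 1 with *COMMIT, which 
is the weakest-to-strongest ordering described above.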

That's fairly straightforward; the issue between us is what constitutes 
"the current group". Having two verbs (one for me, one for you) is just 
a recipe for even more confusion.


Philip Hazel