develooper Front page | perl.perl5.porters | Postings from September 2011

Re: [perl #92898] (*THEN) broken inside condition subpattern

Philip Hazel
September 21, 2011 02:16
On Tue, 20 Sep 2011, Nicholas Clark wrote:

> About the only possibly useful input I think I can have after reading
> the thread of this bug, and the entire documentation:


> is that I'm sadly thinking that you're right about the trouble/worth trade.
> I don't actually understand any of this. Which isn't a good sign, as based
> on previous experience I'm going to make the possibly arrogant assumption
> that *I* am not the one at fault for the lack of understanding.
> [Nothing uses (*THEN) in the core, other than the 14 lines of tests for it]

I've been thinking about this some more. My naive understanding of *THEN 
is basically this: it is effectively just another way of doing what 
(?>...) does, but with possibly simpler syntax and the added feature of
*THEN:NAME. The emphasis on alternation is really a red herring.
Thus, if you have

   A (*THEN) B

(where A and B are complex patterns) the matching engine, having passed 
(*THEN) and subsequently failed in B, no longer backtracks into A. This 
would be the same:

   (?>A) B

If you go along with this, it follows that, if (*THEN) is within a 
group, for example,

   C (A (*THEN) B)

then a failure in B must backtrack into C, just as would happen with

   C ((?>A) B)

In other words, the effect of (*THEN) within a group does not propagate 
back beyond the start of the group. IMHO this should also apply to 
conditional groups which, after all, behave sort of like a group with 
only one branch (just that there's a choice of which one each time).
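
To make the atomic-group half of this comparison concrete, here is a minimal Perl sketch. It is not part of the original message: the subject strings and the choices C = \w*?, A = \d+ and B = "-" are assumptions made purely for illustration. It shows only the (?>...) behaviour, which is well defined; whether (*THEN) behaves the same way inside a group is exactly the point in dispute.

```perl
use strict;
use warnings;

# (?>A) B: once the atomic group has matched, a later failure in B
# cannot re-enter the group to try a shorter match for A.
my $plain_ok  = "aaab" =~ /a+ab/     ? 1 : 0;   # a+ can give back one "a"
my $atomic_ok = "aaab" =~ /(?>a+)ab/ ? 1 : 0;   # a+ keeps every "a" it took

# C ((?>A) B) with the hypothetical C = \w*?, A = \d+, B = "-".
# B fails after "12" (the next character is "x"), so the engine
# backtracks into C, which grows until the group can match "34-".
my $subject  = "12x34-";
my $captured = $subject =~ /^\w*?((?>\d+)-)/ ? $1 : undef;

print "plain: $plain_ok, atomic: $atomic_ok\n";
print "group captured: ", (defined $captured ? $captured : "nothing"), "\n";
```

The second match succeeds only because a failure after the atomic group is still allowed to backtrack into C, which is the behaviour argued for above.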

Now, it seems that Perl thinks differently to me. There seems to be the
concept of "group with no alternation" and "group with alternation" as
two different things that are handled differently. (And a conditional
group is of the former type.) This was shown up by my example, where
adding what was in effect a dummy alternative to a group with only one
branch caused a change in behaviour. It is a valid concept, but I humbly
submit that it is very confusing and unexpected. The fact that (*THEN)
in these two examples behaves differently is, to me at least, surprising:

   A (B (*THEN) C)
   A (B (*THEN) C | D) 

In the first, Perl fails the match without any backtracking if C fails;
in the second, it backtracks into A. Or that is what my experiments suggest.
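
One way to repeat this kind of experiment is sketched below. It is not the example from the original thread: the subject string and the choices A = \w*?, B = \d+, C = "-" and D = "Z" (a dummy alternative that never appears in the subject) are assumptions made for illustration. The script only prints what the installed perl does, since the outcome for the (*THEN) patterns may depend on the Perl version.

```perl
use strict;
use warnings;

my $subject = "12x34-";   # hypothetical subject chosen for this sketch

# A = \w*?, B = \d+, C = "-", D = "Z" (never present in the subject).
my %patterns = (
    'no verb'           => qr/^\w*?(\d+-)/,
    'A (B (*THEN) C)'   => qr/^\w*?(\d+(*THEN)-)/,
    'A (B (*THEN) C|D)' => qr/^\w*?(\d+(*THEN)-|Z)/,
);

my %outcome;
for my $label (sort keys %patterns) {
    # Record either the captured group or the fact that nothing matched.
    $outcome{$label} = $subject =~ $patterns{$label}
        ? "matched '$1'"
        : "no match";
    printf "%-20s %s\n", $label, $outcome{$label};
}
```

With this subject, C fails the first time it is tried (the character after "12" is "x"), so a match is possible only if the engine is allowed to backtrack into A. Differing results for the two (*THEN) lines would therefore reproduce the inconsistency described above.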


PS On the matter of value/worth, some of the other backtracking verbs 
behave oddly too, though admittedly with silly examples:

Pattern: (*ACCEPT)a
Subject: bax
Result:  Perl matches "", with "ax" as the remainder of the subject,
         in other words, the match point is after "b".

Pattern: (*ACCEPT)
Subject: bax
Result:  Perl matches "", with "bax" as the remainder of the subject,
         in other words, the match point is at the start.
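
The two (*ACCEPT) cases can be checked with a short script like the following sketch. The helper name `report` is hypothetical and not from the original message; it uses Perl's @- and @+ arrays to show where the match starts and what remains of the subject. No expected output is given for the first pattern, because the result above was observed on one particular Perl version.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the start offset, the matched text and
# the remainder of the subject, or nothing if the pattern fails.
sub report {
    my ($pattern, $subject) = @_;
    my $re = qr/$pattern/;
    if ($subject =~ $re) {
        my ($start, $end) = ($-[0], $+[0]);
        return {
            start     => $start,
            matched   => substr($subject, $start, $end - $start),
            remainder => substr($subject, $end),
        };
    }
    return;
}

my %results;
for my $pattern ('(*ACCEPT)a', '(*ACCEPT)') {
    my $r = report($pattern, 'bax');
    $results{$pattern} = $r;
    if ($r) {
        printf "%-12s start=%d matched='%s' remainder='%s'\n",
            $pattern, @{$r}{qw(start matched remainder)};
    }
    else {
        printf "%-12s no match\n", $pattern;
    }
}
```

A start offset of 1 for the first pattern and 0 for the second would correspond to the results reported above.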

This is Perl 5.12.3 (reported by $] as 5.012003).

Philip Hazel
