On Sun, 18 Sep 2011, Father Chrysostomos via RT wrote:

> So you are saying that (?(condition)foo(*THEN)bar|baz) should jump out
> of the conditional group (since the |baz part is not a backtracking
> point, but is only reached when the condition is false), but that
> (?(condition)foo(*THEN)bar) should fail the whole pattern (there being
> no |bar)?

No, I'm not.

> I think it ends up being too confusing. The | in a conditional has
> nothing to do with regular alternation. That view is reinforced by the
> fact that only one pipe is permitted:

That is certainly true, but to me, as a simple-minded person, it *looks*
like a regular alternation. The only difference is that the matching
engine just tries one of the alternatives rather than both. Consider

  /^.*?(?(?=a)a(*THEN)b|c)/    Pattern
  ac                           Subject

It starts off trying with zero matches of "a". The condition is true, so
it matches a, fails on b, and then backtracks to (*THEN). In a "normal"
group it would try the next alternative, but a conditional group behaves
as if there is only one alternative, so it should just backtrack as if
the group had failed, thereby trying again with one "a" matching .* and
so eventually succeeding. I think the same should happen for this
example:

  /^.*?(?(?=a)a(*THEN)b)c/
  ac

Further investigation shows up another issue. If (*THEN) appears in a
regular (non-conditional) group that has no alternatives, its effect
again extends beyond the group.

  /^.*?(a(*THEN)b)c/
  aabc

Perl gives "no match"; PCRE currently matches. However, if we give it a
dummy alternative:

  /^.*?(a(*THEN)b|z)c/

then Perl (5.012003) does match. That seems very counter-intuitive to me.
Perhaps, however, this does tie in with the way Perl handles conditional
groups, since they seem to have the same behaviour.

The text in perlre for *THEN says "when backtracked into on failure, it
causes the regex engine to try the next alternation in the innermost
enclosing group".
It doesn't say what happens if there are no alternations or indeed if
*THEN occurs in the final alternation. A check with

  /^.*?(z|a(*THEN)b)c/

shows that Perl does match in this case too.

> > While thinking about this and experimenting, I've just discovered
> > another oddity of (*THEN).
> >
> > Pattern: /a+?(*THEN)c/
> > Subject: aaac
> > Result: Perl 5.012003 matches "aaac"
>
> That’s strange. In 5.14 it doesn’t match. I don’t know which is worse.

I sometimes wonder whether these new backtracking verbs are going to
prove more trouble than they are worth.

Regards,
Philip

--
Philip Hazel