On Fri, Jan 25, 2013 at 4:17 PM, Ronald J Kimball <rjk@tamias.net> wrote:
> On Fri, Jan 25, 2013 at 09:23:54AM -0800, Philip Hazel wrote:
>
>> My understanding of how (*THEN) works is that the test below should
>> match. The perlre page says "...this verb always matches, and when
>> backtracked into on failure, it causes the regex engine to try the next
>> alternation in the innermost enclosing group (capturing or otherwise)
>> that has alternations." Unless I am going mad, the examples below (one a
>> normal group, the other an assertion) fulfil the condition.
>>
>> $ perl -e 'print (("ac" =~ /^(?=ab|ac)/)? "yes\n":"no\n")'
>> yes
>> $ perl -e 'print (("ac" =~ /^(?=a(*THEN)b|ac)/)? "yes\n":"no\n")'
>> no
>>
>> $ perl -e 'print (("ac" =~ /^(ab|ac)/)? "yes\n":"no\n")'
>> yes
>> $ perl -e 'print (("ac" =~ /^(a(*THEN)b|ac)/)? "yes\n":"no\n")'
>> no
>
> These work in 5.10.1, but not in 5.14.1.
>
> These are the only tests involving (*THEN) that expect a successful match,
> from t/re/pat_advanced.t:
>
>     {
>         #Mindnumbingly simple test of (*THEN)
>         for ("ABC","BAX") {
>             ok /A (*THEN) X | B (*THEN) C/x, "Simple (*THEN) test";
>         }
>     }
>
> The key difference seems to be that in your tests, the two alternations
> begin with the same character.

This appears to be caused by the TRIE optimization (as far as I can tell):

$ perl -Mre=debug -e'print (("ac" =~ /^(a(*THEN)b|ac)/)? "yes\n":"no\n")'
Compiling REx "^(a(*THEN)b|ac)"
Final program:
   1: BOL (2)
   2: OPEN1 (4)
   4:   TRIE-EXACT[a] (14)
        <a> (7)
   7:     CUTGROUP (9)
   9:     EXACT <b> (14)
        <ac> (14)
  14: CLOSE1 (16)
  16: END (0)
anchored(BOL) minlen 2
Matching REx "^(a(*THEN)b|ac)" against "ac"
   0 <> <ac> |  1:BOL(2)
   0 <> <ac> |  2:OPEN1(4)
   0 <> <ac> |  4:TRIE-EXACT[a](14)
   0 <> <ac> |    State: 1 Accepted: N Charid: 1 CP: 61 After State: 2
   1 <a> <c> |    State: 2 Accepted: Y Charid: 2 CP: 63 After State: 3
   2 <ac> <> |    State: 3 Accepted: Y Charid: 0 CP: 0 After State: 0
                  got 2 possible matches
                  TRIE matched word #1, continuing
   1 <a> <c> |  7:  CUTGROUP(9)
   1 <a> <c> |  9:  EXACT <b>(14)
                    failed...
                  failed...
Match failed
no
Freeing REx: "^(a(*THEN)b|ac)"

This fails in the exact same manner:

$ perl -Mre=debug -e'print (("ac" =~ /^((?:a(*THEN)b)|ac)/)? "yes\n":"no\n")'

This succeeds:

$ perl -Mre=debug -e'print (("ac" =~ /^((a(*THEN)b)|ac)/)? "yes\n":"no\n")'
Compiling REx "^((a(*THEN)b)|ac)"
Final program:
   1: BOL (2)
   2: OPEN1 (4)
   4:   BRANCH (15)
   5:     OPEN2 (7)
   7:       EXACT <a> (9)
   9:       CUTGROUP (11)
  11:       EXACT <b> (13)
  13:     CLOSE2 (18)
  15:   BRANCH (FAIL)
  16:     EXACT <ac> (18)
  18: CLOSE1 (20)
  20: END (0)
anchored(BOL) minlen 2
Matching REx "^((a(*THEN)b)|ac)" against "ac"
   0 <> <ac> |  1:BOL(2)
   0 <> <ac> |  2:OPEN1(4)
   0 <> <ac> |  4:BRANCH(15)
   0 <> <ac> |  5:  OPEN2(7)
   0 <> <ac> |  7:  EXACT <a>(9)
   1 <a> <c> |  9:  CUTGROUP(11)
   1 <a> <c> | 11:  EXACT <b>(13)
                    failed...
                  failed...
   0 <> <ac> | 15:BRANCH(18)
   0 <> <ac> | 16:  EXACT <ac>(18)
   2 <ac> <> | 18:  CLOSE1(20)
   2 <ac> <> | 20:  END(0)
Match successful!
yes
Freeing REx: "^((a(*THEN)b)|ac)"