I pushed this merge commit a couple of days ago. It's fairly
self-explanatory. It was originally an attempt to fix intuit-only matches
under COW, and grew into a 50-commit monster.

commit e82485c19c70d922047c43d035a5e59a7c08ce67
Merge: 8088f39 2bfbe30
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Sun Jul 28 14:09:44 2013 +0100
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Sun Jul 28 14:09:44 2013 +0100

    [MERGE] refactor pp_match(), pp_subst(), regexec()

    Notionally the regexec engine has a well-defined API. In practice, the
    caller of regexec() (typically pp_match() or pp_subst()) is required to
    do a lot of set-up before calling regexec(), and some post-processing
    afterwards; in particular to handle \G, to handle intuit, and to set up
    $& correctly after an intuit-only match.

    The series of commits in this branch refactors the code around these
    three functions so that all the regex "knowledge" is now contained
    within regexec() rather than in the calling pp functions. At the same
    time, the pp functions have been heavily cleaned up and simplified
    where possible. This reduces the LOC in pp_match() from 305 to 186.

    The most visible refactorisation changes are that:

    * the call to intuit is now done from regexec() rather than from pp*;

    * ditto the setting of $& on intuit-only matches;

    * all the extra setup for \G is now in a single block of code in
      regexec(), rather than being distributed haphazardly across all 3
      functions.

    Along the way various things have been improved and bugs have been
    fixed:

    * intuit-only matches had been inadvertently disabled when COW was
      enabled; this is now fixed. (An intuit-only match is where intuit
      finding a suitable start position is sufficient to determine that
      the pattern has matched, e.g. a fixed string pattern /abc/ without
      captures);

    * intuit-only substitutions had never been enabled; they are now,
      e.g. s/foo/bar/g;

    * formerly, intuit was skipped in the presence of anchored \G; this is
      no longer the case, so that something like "aaaa" =~ /\G.*xx/ now
      fails quickly due to the missing "xx";

    * the COW code will try to reuse the COW copy SV on subsequent
      captures on the same regex and string, rather than freeing and
      reallocating;

    * substitutions will no longer permit themselves to iterate
      "backwards", e.g. with s/.(?=.\G)/x/g;

    * some obscure utf8 issues with s/// have been fixed;

    * some bugs with \G fixed (and probably new ones added).

-- 
Indomitable in retreat, invincible in advance, insufferable in victory
    -- Churchill on Montgomery