> A couple of days ago someone (mjd?) was wondering if the problem that
> threads have with the regex match variables (where the regex variables are
> tied to pieces of the code and reflect the last thread that executed it)
> occurred with non-threaded code too.

Well, it's more that I already knew there was this problem with
matching, and I was trying to find out if this other problem, that
regexes don't work properly under threads, was actually the same.

> Just for chuckles, I decided to try it this afternoon and, as
> expected, the problem can be duplicated with a non-threaded perl
> build. Witness:

Here's a simpler version:

    sub foo {
      my $s = shift;
      return unless $s =~ /(.)/;
      print "$1";
      foo(substr($s, 1));
      print "$1";
    }

    foo('ouch');

We'd like this to emit `ouchhcuo', because we would expect the two
`print' calls in each invocation of `foo' to each print the same
thing. But instead the backreference variables get clobbered by the
recursive call to `foo' and you get `ouchhhhh'.

Sarathy:
> I'm sad to see that it hasn't been fixed in more than a year.

Almost two years since I brought it up, and at the time Chip called it
a `known limitation'. Here's my example from January 1998:

    # Given a pattern, return an anonymous function which
    # checks to see if its argument matches that pattern
    sub make_matcher {
      my $pat = shift;
      sub {
        my $target = shift;
        $target =~ /$pat/o;
      };
    }

    my $a = make_matcher('a');
    my $b = make_matcher('b');

    print ($a->('aa') ? "matched\n" : "did not match.\n");  #1
    print ($b->('bb') ? "matched\n" : "did not match.\n");  #2
    print ($b->('aa') ? "matched\n" : "did not match.\n");  #3

You would like for #1 and #2 to match, and for #3 to not match. But
instead, #3 matches and #2 does not. You think you are returning two
anonymous functions, but they share code, and because the regex that
is cached by /o is cached in the shared code, the two functions share
the cached regex also.
(http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1998-01/msg02163.html)

Most of the discussion occurred in February.

The solution here is that instead of storing the cached regex (or the
pointer to it) into the op tree, you use an extra layer of
indirection. The op tree should have an offset into the pad, and the
cached regex is pointed to from the pad. The pad is not shared
between threads / recursive subroutine invocations / anonymous
functions, so each one gets its own cached regex.

Similarly, s/cached regex/backreference variables/.

There are a handful of other features that suffer from the same
problem. Some of these were discussed in the thread from 1998
(Subject: Shared OPs among closures). This may include stateful
scalar operators such as glob() and `..' and `...'. Chip and Sarathy
spent some time discussing these, but I was not able to figure out who
was right. I just wrote a test for glob(), though, and it appears
that glob() does exhibit the bad behavior.

> Given that this is going to be fixed in one way or another for threaded
> perl, do we want to go all the way and fix it so non-threaded perl does
> lexical scoping for the match variables too?

I think that consensus in 1998 was that it should be fixed by
indirecting through the pad.

Other notes:

Tim Bunce: ``The same issue applies to threads.''
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1998-02/msg00101.html
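
For illustration, the pad indirection proposed above can be modelled in
ordinary Perl. This is only a toy sketch of the idea, not the
interpreter's real data structures: `%shared_op', `match_shared',
`match_via_pad', and the two pad arrays are all hypothetical names
invented for this example, and a plain array stands in for a pad.

```perl
use strict;
use warnings;

# Toy model only: %shared_op stands in for a single match op in the op
# tree, which every closure built from the same source code shares.
my %shared_op = (cached_regex => undef, pad_offset => 0);

# Behaviour complained about above: the compiled regex is cached in the
# op itself, so the first pattern compiled wins for every closure.
sub match_shared {
    my ($pat, $target) = @_;
    $shared_op{cached_regex} = qr/$pat/
        unless defined $shared_op{cached_regex};
    return $target =~ $shared_op{cached_regex} ? 1 : 0;
}

# Proposed behaviour: the op holds only an offset; the cached regex
# lives in the pad, and each closure brings its own pad.
sub match_via_pad {
    my ($pad, $pat, $target) = @_;
    my $slot = $shared_op{pad_offset};
    $pad->[$slot] = qr/$pat/ unless defined $pad->[$slot];
    return $target =~ $pad->[$slot] ? 1 : 0;
}

my (@pad_a, @pad_b);    # one (hypothetical) pad per closure

# Same three checks as #1, #2, #3 in the make_matcher example.
my @shared  = (match_shared('a', 'aa'),
               match_shared('b', 'bb'),
               match_shared('b', 'aa'));
my @via_pad = (match_via_pad(\@pad_a, 'a', 'aa'),
               match_via_pad(\@pad_b, 'b', 'bb'),
               match_via_pad(\@pad_b, 'b', 'aa'));

print "shared op: @shared\n";     # prints "shared op: 1 0 1"
print "via pad:   @via_pad\n";    # prints "via pad:   1 1 0"
```

In the shared-op model #2 fails and #3 succeeds, as in the make_matcher
example; with the cache reached through a per-closure pad, #1 and #2
match and #3 does not.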