Per Lindquist <per.lindquist@dynamics.saab.se> wrote > Hi, I'm not sure if this should be called a bug but it's a problem > anyway. A 20x slowdown is definitely a bug. In the past, proposed Perl upgrades have been rejected because they caused a 3% overhead. > The 's///' statement seems to take a lot more time in version 5.6.0 than > in version 5.00403, at least in some cases. The execution time has > increased more than 20 times (an in-house script takes 20 minutes > instead of one, which is quite annoying). I also observe this, on several platforms. Here's a cut down version of your test case which shows the effect: use re 'debug'; $_ = "P FIRST = 0\nP " . join(" = 0\nP ", 1..5) . " = 0\nP LAST = 0\n"; /(.*)LAST(.*=).*/; It is (reasonably) fast on perl5.005_03, but slow on perl5.6.0 and indeed with the latest development perl-current. Note that your regular expression is rather inefficient (but that doesn't excuse any bug). The sequence of events will be: Starting at the beginning of the string, it will match 'P FIRST = 0' against '.*', then'll be stopped by the "\n". It'll then fail to match 'LAST', so will backtrack to match 'P FIRST =' against '.*', fail, ... until it's matched '' against '.*'. Now it's failed to match at the start of the string, so it'll move forward to match ' FIRST = 0' against '.*', fail and backtrack as previously ..., trying 'FIRST = 0' ... . So it's doing a quadratic amount of work before even getting to the first newline. Now it'll repeat this searching for each 'P MIDDLE = 0' in turn, till it eventually finds the 'P LAST = 0' and succeeds. Whew! You can reduce this quadratic threshing to linear by restricting the '.*' matches to immediately after newlines, by adding a ^ anchor and adding the /m flag: $file =~ s/^(.*)LAST(.*=).*/$1LAST$2foo;/m ^ ^ This bypasses the bug, and is a good thing to do in any case. Also replace the '.*' by somethinmg more restrictive if you know what you are expecting to match. (In the test case, that would be 'P ', but presumably your real program is doing something more complex.) Back to the bug: the debug output shows that in perl5.005_03, the RE engine is indeed behaving as described above. But the output under perl5.6.0 shows that instead of the quadratic rummage over 'P FIRST = 0', it's doing a cubic search. Dunno why. The effect goes away if the first pair of brackets are made non-capturing: /(?:.*)LAST(.*=).*/; I'll leave it to someone else to dig further. Mike GuyThread Previous | Thread Next