On Sat, Mar 29, 2014 at 08:29:46PM -0700, Karl Williamson via RT wrote: > Here are the 5.19.10+ blead numbers on a Linux box: > > Rate regex regex_opt eq > regex 7650757/s -- -1% -84% > regex_opt 7717895/s 1% -- -84% > eq 48383994/s 532% 527% -- > > > And if we change $token to 'read', we get > > Rate regex regex_opt eq > regex 7748530/s -- -2% -62% > regex_opt 7899967/s 2% -- -61% > eq 20146179/s 160% 155% -- I think the important thing is how much slower the trie optimisation is for simple alternations that wouldn't benefit. Comparing a range of perl versions on the same hardware, I get for the original bug report code (where $token matches the first alternation): 5.008009 regex 10526316/s 5.010001 regex 3676471/s 5.012005 regex 3367003/s 5.014004 regex 5665722/s 5.016003 regex 6756757/s 5.018002 regex 5602241/s 5.019010 regex 6211180/s And here's the same with TRIEs disabled (${^RE_TRIE_MAXBUF} = -1): 5.008009 regex 10309278/s 5.010001 regex 7017544/s 5.012005 regex 6993007/s 5.014004 regex 6920415/s 5.016003 regex 8130081/s 5.018002 regex 6451613/s 5.019010 regex 6644518/s and with TRIEs enabled and $token as 'read', i.e. $token matches the third alternation: 5.008009 regex 8438819/s 5.010001 regex 3809524/s 5.012005 regex 3344482/s 5.014004 regex 5917160/s 5.016003 regex 6711409/s 5.018002 regex 5524862/s 5.019010 regex 6097561/s and with TRIEs enabled and with Yves modification, i.e. $token matches the 12th alternation: 5.008009 regex 4807692/s 5.010001 regex 3724395/s 5.012005 regex 3367003/s 5.014004 regex 5952381/s 5.016003 regex 6688963/s 5.018002 regex 5555556/s 5.019010 regex 6042296/s What I conclude from the above is that: * as with all such micro benchmarks, a certain amount of the variation is just due to the luck of the compiler, e.g. instruction cache alignment; * simple alternations slow down in proportion to the number of alternatives that are tried (as you would expect); * TRIEs match in roughly constant time regardless of which alternation * matches (as you would expect); * even non-TRIE alternations are slower in later perls, most likely due to all the extra overhead we've accumulated, such as utf8 correctness, not blowing the stack on recursion etc. * TRIEs have got faster in 5.14; * But TRIEs still have a bigger overhead than plain alts, so it may be worth looking at * profiling to see if the overhead can be reduced; * seeing whether there's a cut-off point where it's not worth building a TRIE, e.g. if there are less than N alternatives; and there might be other factors such as lengths of the alt strings, whether its utf8 etc. -- The warp engines start playing up a bit, but seem to sort themselves out after a while without any intervention from boy genius Wesley Crusher. -- Things That Never Happen in "Star Trek" #17Thread Previous | Thread Next