On Thu, Mar 31, 2011 at 10:15:55AM +0200, demerphq wrote: > On 24 July 2010 17:46, Dave Mitchell <davem@iabyn.com> wrote: > > The original of this bug report didn't make it to the p5p mailing > > list; presumably because the original test script included 0.5Mb of sample > > HTML. The following demonstrates the same slowdown while generating its > > own sample HTMl data: > > > > #!/usr/bin/perl > > use Time::HiRes qw( time ); > > > > my $html = qq{<div class="boo">\n} x 30_000; > > > > sub try { > > my ($re) = @_; > > my $t = time; > > $html =~ /$re/; > > warn sprintf "%7.4f sec, regexp is %s\n", time-$t, $re; > > } > > > > try(qr{<div class="marketsort[^>]*(?-i:>)\s*}ms); > > try(qr{<div class="marketsort[^>]*(?:>)\s*}ims); > > try(qr{<div class="marketsort[^>]*(?-i:>)}ims); > > try(qr{<div\sclass="marketsort[^>]*(?-i:>)\s*}ims); > > try(qr{<div class="marketsort[^>]*(?-i:>)\s*}ims); > > warn "FINISHED\n"; > > > > which on blead gives: > > > > 0.0004 sec, regexp is (?ms-xi:<div class="marketsort[^>]*(?-i:>)\s*) > > 0.0041 sec, regexp is (?msi-x:<div class="marketsort[^>]*(?:>)\s*) > > 0.0048 sec, regexp is (?msi-x:<div class="marketsort[^>]*(?-i:>)) > > 0.0163 sec, regexp is (?msi-x:<div\sclass="marketsort[^>]*(?-i:>)\s*) > > 61.5914 sec, regexp is (?msi-x:<div class="marketsort[^>]*(?-i:>)\s*) > > FINISHED > > > > The time of the last pattern is quadratic on RHS of 'x' in the $html > > assignment. > > > > I haven't looked into any further than that. > > We never fixed the superlinear cache bug that was broken some time > around when the engine was derecursivized. > > Id bet a dollar this is related. You owe me one dollar ;-) The superlinear cache is only used with CURLYX/WHILEM op pairs, which aren't generated with these particular patterns. Perhaps a case could be made for extending it for other ops as well? (Off the top of my head I don't how viable that is.) The particular bug with the superlinear cache was that, until the derecursivisation [why is my spell-checker complaining???], the cache cached both success and failure positions; afterwards it only cached failure positions. At the time I couldn't think of a case that actually exercised the positive cache; eventually someone found one, but it was very obscure and was deemed not worth worrying about (it involved negative lookahead assertions IIRC). -- Any [programming] language that doesn't occasionally surprise the novice will pay for it by continually surprising the expert. -- Larry WallThread Previous | Thread Next