Re: [perl #76546] regex engine slowdown bug

From: Dave Mitchell
Date: July 24, 2010 08:46
Subject: Re: [perl #76546] regex engine slowdown bug
Message ID: 20100724154628.GF2996@iabyn.com

The original of this bug report didn't make it to the p5p mailing
list, presumably because the original test script included 0.5MB of sample
HTML. The following demonstrates the same slowdown while generating its
own sample HTML data:

    #!/usr/bin/perl
    use Time::HiRes qw( time );

    my $html = qq{<div class="boo">\n} x 30_000;

    sub try {
        my ($re) = @_;
        my $t = time;
        $html =~ /$re/;
        warn sprintf "%7.4f sec, regexp is %s\n", time-$t, $re;
    }

    try(qr{<div class="marketsort[^>]*(?-i:>)\s*}ms);
    try(qr{<div class="marketsort[^>]*(?:>)\s*}ims);
    try(qr{<div class="marketsort[^>]*(?-i:>)}ims);
    try(qr{<div\sclass="marketsort[^>]*(?-i:>)\s*}ims);
    try(qr{<div class="marketsort[^>]*(?-i:>)\s*}ims);
    warn "FINISHED\n";

which on blead gives:

     0.0004 sec, regexp is (?ms-xi:<div class="marketsort[^>]*(?-i:>)\s*)
     0.0041 sec, regexp is (?msi-x:<div class="marketsort[^>]*(?:>)\s*)
     0.0048 sec, regexp is (?msi-x:<div class="marketsort[^>]*(?-i:>))
     0.0163 sec, regexp is (?msi-x:<div\sclass="marketsort[^>]*(?-i:>)\s*)
    61.5914 sec, regexp is (?msi-x:<div class="marketsort[^>]*(?-i:>)\s*)
    FINISHED

The time taken by the last pattern is quadratic in the right-hand operand
of 'x' in the $html assignment (the repeat count, 30_000 above).
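
One rough way to check the quadratic behaviour is to time the slow pattern
at several repeat counts and compare consecutive timings. The following is
only a sketch: the helper name time_match and the particular repeat counts
are made up for illustration and are not part of the original report. If
the time really is quadratic, each doubling of the count should roughly
quadruple the elapsed time on an affected perl; on a perl without the
problem the timings may be too small to show any trend.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw( time );

# The slow pattern from the report (last call to try() above).
my $re = qr{<div class="marketsort[^>]*(?-i:>)\s*}ims;

# Hypothetical helper: time one match attempt against $n copies of the
# sample line. Returns the elapsed seconds and whether the pattern matched.
sub time_match {
    my ($n) = @_;
    my $html = qq{<div class="boo">\n} x $n;
    my $t = time;
    my $matched = ($html =~ $re) ? 1 : 0;
    return (time - $t, $matched);
}

my $prev;
for my $n (2_500, 5_000, 10_000) {
    my ($elapsed, $matched) = time_match($n);
    # Ratio of this timing to the previous one; "-" for the first row.
    my $ratio = (defined($prev) && $prev > 0)
        ? sprintf("%.1f", $elapsed / $prev)
        : "-";
    printf "n=%6d  %8.4f sec  ratio to previous: %s  matched: %d\n",
        $n, $elapsed, $ratio, $matched;
    $prev = $elapsed;
}
```

The repeat counts are kept well below 30_000 so that the run stays short
even when the slowdown is present.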

I haven't looked into it any further than that.



-- 
Red sky at night - gerroff my land!
Red sky at morning - gerroff my land!
    -- old farmers' sayings #14
