develooper Front page | perl.perl5.porters | Postings from July 2014

[perl #122331] Pathological performance of a pattern match

Thread Next
From:
perlbug-followup
Date:
July 19, 2014 21:31
Subject:
[perl #122331] Pathological performance of a pattern match
Message ID:
rt-4.0.18-26880-1405804862-357.122331-75-0@perl.org
# New Ticket Created by  (Andreas J. Koenig) 
# Please include the string:  [perl #122331]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=122331 >


I think this bugreport deserves to be a quiz. Test your intuition.

Sample string to test the regexps against:

  sprintf "%scould%snot%sopen%s", ("x"x10000)x4;

Regexp 1:

  qr!.*(?:could ?not (?:open|connect|find))!

Regexp 2 (the difference is it's case insensitive):

  qr!.*(?i:could ?not (?:open|connect|find))!

Note: neither matches and this is correct.

Question: which is faster and by how much?

Answer: regexp 1 is much more than 100 times slower than the regexp 2.

On my machine this program takes 1-2 wallclock seconds:

  time $p -le 'my $x = sprintf "%scould%snot%sopen%s", ("x"x10000)x4;print $x =~ m!.*(?:could ?not (?:open|connect|find))! ? "not ok" : "ok";' 

And on all those perls it takes less than 0.01 wallclock seconds when I
add the "i".

Historical evidence: this is not a regression, at least not since 5.6. I
have witnessed the same slowness on all my perls between 5.6.1 and
bleadperl.

Competition evidence: with 'use re::engine::PCRE;' both regexps are
fast.

Enjoy,
-- 
andreas


Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About