
[perl #66762] Regex search time varies wildly with input

From: James E Keenan via RT
Date: March 1, 2013 02:35
Subject: [perl #66762] Regex search time varies wildly with input
Message ID: rt-3.6.HEAD-28177-1362105341-1127.66762-15-0@perl.org

On Wed Feb 27 23:16:50 2013, demerphq wrote:
> On 27 February 2013 00:46, Andrew Daviel via RT
> <perlbug-followup@perl.org> wrote:
> > I must have missed Bram's reply and then forgotten about this.
> >
> > When I retry my case #2 with Time::HiRes, I now at least get better
> > results from Perl 5.10 than from Perl 5.8:
> >
> > match                           Perl 5.8     5.10
> > /(.*[\d]+).*radon/               0.72 us     0.51 us
> > /(.*[\d]+).*radon/i           5345.95 us    48.1 us
> > /.*[\d]+.*radon/i               77.23 us    45.16 us
> > /.*[\d]+.*(?i:radon)/           55.63 us    45.06 us
> > /(.*[\d]+).* radon/i          5483.97 us    51.47 us
> > /(.*[\d]+).* (?i:radon)/      5285.52 us  4432.8 us
> > /(.*[\d]+).* foobar.* radon/  1428.14 us  1408.74 us
> > /(.*[\d]+).* foobar.* radon/i 4534.89 us    70.32 us
> >
> > If I lowercase the input before matching, it's faster:
> > s/radon/radon/i ; /.*[\d]+.* radon/  1.61 us   1.72 us
> >
> > I still find it hard to see which patterns are going to be slow
> 
> The optimizer has many optimizations; depending on the constructs
> used, it can sometimes simplify things.
> 
> Note that A.*B is almost always a crap pattern. If you can change the
> .* to something more specific you usually improve performance.
> 
> A.*B.*C is an even worse pattern. Same advice as above.
> 
> If the pattern contains a long fixed string at a useful position, the
> engine can use that string both to fail fast and to bound the search
> space. Case-insensitive text is often of no use to this
> optimization.
> 
> Capturing text will slow things down.
> 
> Anyway, please compare against blead; we aren't going to change
> the 5.10 regex engine.
> 
> Yves
> 
> 
> 
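
For anyone who wants to repeat the comparison on a newer perl, here is a
minimal sketch using the core Benchmark module. The input line is
hypothetical (the original test data is not shown in this ticket), and
the variant names are only labels chosen for illustration, so the
numbers will not match the table quoted above. It compares capturing
against non-capturing, /i against an inline (?i:...) group, and the
normalize-the-input-first approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: filler text containing digits, then the target word.
my $line = ('lorem ipsum 12345 dolor sit amet ' x 20) . 'Radon level';

# Variants taken from the patterns discussed in the thread.
# Each sub returns 1 on a match and 0 otherwise.
my %variant = (
    capture_i   => sub { $line =~ /(.*[\d]+).* radon/i   ? 1 : 0 },
    nocapture_i => sub { $line =~ /.*[\d]+.* radon/i     ? 1 : 0 },
    inline_i    => sub { $line =~ /.*[\d]+.* (?i:radon)/ ? 1 : 0 },
    normalized  => sub {
        # Lowercase the target word in a copy, then match case-sensitively.
        (my $copy = $line) =~ s/radon/radon/i;
        $copy =~ /.*[\d]+.* radon/ ? 1 : 0;
    },
);

# Sanity check: record each variant's result so that timings are only
# compared between variants that agree on the answer.
my %result = map { $_ => $variant{$_}->() } keys %variant;

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%variant);
```

Run it under each perl you want to compare (for example a release build
and blead); only the relative rates within one run are meaningful.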

There's no evidence of a bug in Perl here; it's mostly a Perl regex
usage question best handled in other forums.  So I'm closing this
ticket.  (If anyone has evidence that regex performance in 5.16 or
blead has regressed relative to earlier versions, that should be the
subject of a new ticket.)

Thank you very much.
Jim Keenan

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=66762
