develooper Front page | perl.perl5.porters | Postings from February 2013

Re: [perl #66762] Regex search time varies wildly with input

From: demerphq
Date: February 28, 2013 07:16
Subject: Re: [perl #66762] Regex search time varies wildly with input
Message ID: CANgJU+XgwaYwm0cxjiH1BuXrz8a7xUXq1zHCFw3Fp2x6sGntDw@mail.gmail.com

On 27 February 2013 00:46, Andrew Daviel via RT
<perlbug-followup@perl.org> wrote:
> I must have missed Bram's reply and then forgotten about this.
>
> When I re-try my case #2 with Time::HiRes, now at least I get better
> results from Perl 5.10 than from Perl 5.8
>
> match                           Perl 5.8     5.10
> /(.*[\d]+).*radon/               0.72 us     0.51 us
> /(.*[\d]+).*radon/i           5345.95 us    48.1 us
> /.*[\d]+.*radon/i               77.23 us    45.16 us
> /.*[\d]+.*(?i:radon)/           55.63 us    45.06 us
> /(.*[\d]+).* radon/i          5483.97 us    51.47 us
> /(.*[\d]+).* (?i:radon)/      5285.52 us  4432.8 us
> /(.*[\d]+).* foobar.* radon/  1428.14 us  1408.74 us
> /(.*[\d]+).* foobar.* radon/i 4534.89 us    70.32 us
>
> If I lowercase the input before matching, it's faster:
> s/radon/radon/i ; /.*[\d]+.* radon/  1.61 us   1.72 us
>
> I still find it hard to see which patterns are going to be slow

The optimizer has many optimizations; depending on the constructs in
the pattern, it can sometimes simplify things.

Note that A.*B is almost always a crap pattern. If you can change the
.* to something more specific you usually improve performance.

A.*B.*C is an even worse pattern. Same advice as above.
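
As a rough illustration of "something more specific" (the input line and
the field names are made up for this sketch, they are not from the bug
report): a negated character class cannot run past the delimiter, so the
engine never has to scan to the end of the line and backtrack.

```perl
use strict;
use warnings;

# Hypothetical input, invented for illustration only.
my $line = 'id=42; name="radon"; unit="Bq/m3"';

# Generic: .* first runs to the end of the line, then backtracks
# until the final quote can match.
my ($greedy) = $line =~ /name="(.*)"/;

# More specific: [^"]* stops at the first closing quote, so there is
# nothing to backtrack over.
my ($specific) = $line =~ /name="([^"]*)"/;

print "greedy:   $greedy\n";
print "specific: $specific\n";
```

Here the specific form also captures what was probably intended (just
the name), while the greedy form swallows everything up to the last
quote on the line.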

If the pattern contains a long fixed string at a useful position, the
optimizer can use it both to fail fast and to bound the search space.
Case-insensitive text is often not usable for this optimization.
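
One way to see which fixed strings the optimizer picked up is
re::regmust (available since 5.10). A minimal sketch; the helper name
fixed_strings is mine, and what gets reported for the /i pattern depends
on the perl version, so I'm not quoting any output here:

```perl
use strict;
use warnings;
use 5.010;
use re qw(regmust);

# Hypothetical helper: return the longest anchored and floating fixed
# strings the optimizer found, or '(none)' where it found nothing.
sub fixed_strings {
    my ($qr) = @_;
    my ($anchored, $floating) = regmust($qr);
    return ($anchored // '(none)', $floating // '(none)');
}

# Two of the patterns from the timings above, with and without /i.
for my $qr (qr/(.*[\d]+).*radon/, qr/(.*[\d]+).*radon/i) {
    my ($anchored, $floating) = fixed_strings($qr);
    print "$qr\n  anchored: $anchored\n  floating: $floating\n";
}
```

If a pattern reports no fixed string at all, that is a hint the engine
has nothing cheap to search for before starting the real match.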

Capturing text will slow things down.
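
If you only need to know whether the pattern matched, a non-capturing
group (or no group at all) avoids that work. A small sketch using one of
the patterns from the timings above; the input string is invented for
the example:

```perl
use strict;
use warnings;

# Hypothetical input, invented for illustration only.
my $input = 'sample 12 contains foobar and radon';

# Capturing: the engine has to record where the group matched.
my ($captured) = $input =~ /(.*[\d]+).*radon/;

# Non-capturing: same grouping, but nothing is saved.
my $matched = $input =~ /(?:.*[\d]+).*radon/ ? 1 : 0;

print "captured: $captured\n";
print "matched:  $matched\n";
```

Both forms accept the same inputs; only the first one pays for saving
the text. The Benchmark module is one way to measure the difference on
your own data.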

Anyway, please compare against blead; we aren't going to change the
5.10 regex engine.

Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"



nntp.perl.org: Perl Programming lists via nntp and http.