perl.perl5.porters, May 2012

Re: [perl #32331] grep {/PATTERN/} is slow

From: Abigail
Date: May 1, 2012 04:46
Subject: Re: [perl #32331] grep {/PATTERN/} is slow
Message ID: 20120501114752.GB12243@almanda

On Tue, May 01, 2012 at 04:14:12AM -0700, Dave Mitchell via RT wrote:
> On Tue, May 01, 2012 at 10:48:53AM +0000, Ed Avis wrote:
> > Some say that
> > 
> >     grep { /PATTERN/ } @x
> > 
> > is better style than
> > 
> >     grep /PATTERN/, @x
> > 
> > (Perl Best Practices and perlcritic BuiltinFunctions::RequireBlockGrep)
> > 
> > To keep these people happy it would be good to put in an optimization for the
> > former code to keep it just as fast as the latter.
> > 
> > If speeding it up is a WONTFIX, then perhaps perlcritic's rules need to change.
> 
> It definitely isn't yet a WONTFIX: there's something specifically about
> using a regex (as opposed to just == say) that makes it *much* slower
> in a block than would be expected.

Note that the slowdown isn't a "regexp in a block". It's a
"regexp in a grep block". As the original bug report shows, the
difference between for-expr and for-block is much smaller (with
5.15.9, it is less than 10% on my machine).
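
A minimal sketch of how the for-loop comparison could be reproduced,
in the same Benchmark style as the rest of this message. The sub names
count_for_expr and count_for_block are made up for illustration, and
reading "for-expr" as the statement-modifier form is an assumption;
the original bug report may have measured it differently. No timings
are shown because they depend on the perl version and the machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

our @array = 1 .. 200;

# Statement-modifier form: the regexp is part of a bare expression.
sub count_for_expr {
    my $n = 0;
    /^50$/ and $n++ for @array;
    return $n;
}

# Block form: the same regexp inside a loop body block.
sub count_for_block {
    my $n = 0;
    for (@array) { $n++ if /^50$/ }
    return $n;
}

# Both forms must agree before comparing their speed makes sense.
die "forms disagree" unless count_for_expr() == count_for_block();

# Run each variant for at least one CPU second and print a comparison.
cmpthese(-1, {
    for_re_expr => \&count_for_expr,
    for_re_blck => \&count_for_block,
});
```

Using code references rather than strings adds one sub call per
iteration of the benchmark, which is small next to the 200 regexp
matches each call performs.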

With map, I get an in-between value (using 5.15.9):

   use Benchmark qw /cmpthese/;

   our @array = 1 .. 200;

   cmpthese -2 => {
       map_re_blck => '$::d1 = map {/^50$/ ? $_ : ()} @::array',
       map_re_expr => '$::d2 = map  /^50$/ ? $_ : (), @::array',
   };

   __END__
                   Rate map_re_blck map_re_expr
   map_re_blck  8756/s          --        -39%
   map_re_expr 14422/s         65%          --


Also, if the matches are failing, the difference between grep-block
and grep-expr is bigger (using 5.15.9):

   use Benchmark qw /cmpthese/;

   our @array1 = 5000 .. 5099;
   our @array2 = 4000 .. 4099;

   cmpthese -2 => {
       grep_re_blck1 => '$::d1 = grep {/^50/} @::array1',
       grep_re_expr1 => '$::d2 = grep  /^50/, @::array1',
       grep_re_blck2 => '$::d1 = grep {/^50/} @::array2',
       grep_re_expr2 => '$::d2 = grep  /^50/, @::array2',
   };

   __END__
                    Rate grep_re_blck1 grep_re_blck2 grep_re_expr1 grep_re_expr2
   grep_re_blck1  5879/s            --          -32%          -51%          -81%
   grep_re_blck2  8635/s           47%            --          -28%          -72%
   grep_re_expr1 12041/s          105%           39%            --          -61%
   grep_re_expr2 30629/s          421%          255%          154%            --

> 
> Until the reason is diagnosed, we can't really decide what to do about
> it (but I'm not intending to look at it just yet).
> 


The bug is quite old (I've known about it for longer than the bug report
is old -- it is at least as old as 5.005), so there's no urgency. OTOH,
"grep {/PATTERN/}" is an often-used idiom.



Abigail
