Front page | perl.perl5.porters |
Postings from January 2012
Re: whither study()?
From: Andy Dougherty
Date: January 31, 2012 08:02
Subject: Re: whither study()?
Message ID: alpine.DEB.2.00.1201311052320.21742@fractal.phys.lafayette.edu
On Tue, 31 Jan 2012, demerphq wrote:
> On 31 January 2012 14:52, Andy Dougherty <doughera@lafayette.edu> wrote:
> > On Mon, 30 Jan 2012, Ricardo Signes wrote:
> >
> >> * demerphq <demerphq@gmail.com> [2012-01-29T04:41:33]
> >> > Does anybody have any examples where it actually makes a difference?
> >>
> >> I second that question, but I only care if the difference is the kind of thing
> >> we want to keep around. ;)
> >
> > Yes, I've used it, and yes it has typically made a difference (around 5%
> > last several times I benchmarked it). However, I've only used it for
> > simple patterns of straight ASCII text. I haven't run into any corner
> > cases or subtle bugs, but I haven't stressed it too much either.
>
> If you have a case you can share I would really like to see it. My
> thinking is that other strategies might provide better results.
metaconfig (used to generate perl's Configure) uses study to some
advantage. Without study, a metaconfig run takes 58s on my system. With
study, it only takes 48s.
metaconfig makes a list of patterns (symbols it knows about) and looks
through every file in the perl distribution looking for each of those
symbols. Abstracting what it does a bit, I've used the following program
over the past many years to track the use of study. I append it here for
whatever it's worth. Making the different patterns truly different would
make a fairer test, but this is what I cooked up all those years ago.
#!/usr/local/bin/perl
# undef $/;
$search = "while (<>) { \n";
$search .= "study;\n";
$patt = "abcdefghijklmnopqrstuvwxyz0000";
for ($i = 0; $i < 250; $i++) {
    $search .= 'print "$ARGV: $_" if ' . "/$patt/;\n";
    $patt++;
}
$search .= "}\n";
# print $search;
eval $search;
__END__
I tried both 4.036 and 5.000, both with and without the study.
I also tried adding an undef $/; to the beginning of the program.
(metaconfig uses over 500 patterns, but I ran out of patience:-)
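The four study/slurp combinations can also be driven from one script. The following is only a sketch under stated assumptions, not the program that produced the numbers in this message: the helper names build_matcher_source and run_search are hypothetical, it counts hits instead of printing them, and it reads from an in-memory filehandle so that it is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: generate the source of a matcher sub with $count
# near-identical patterns, calling study only when asked.
sub build_matcher_source {
    my ($use_study, $count) = @_;
    my $patt = "abcdefghijklmnopqrstuvwxyz0000";
    my $src  = "sub {\n    my (\$fh) = \@_;\n    my \$hits = 0;\n    local \$_;\n";
    $src .= "    while (<\$fh>) {\n";
    $src .= "        study;\n" if $use_study;
    for (1 .. $count) {
        $src .= "        \$hits++ if /$patt/;\n";
        $patt++;    # magic string increment: ...0000, ...0001, ...
    }
    $src .= "    }\n    return \$hits;\n}\n";
    return $src;
}

# Hypothetical helper: compile the matcher and run it over $text,
# either line by line or with the whole text as a single record.
sub run_search {
    my ($text, $use_study, $slurp, $count) = @_;
    my $matcher = eval build_matcher_source($use_study, $count)
        or die "generated code failed to compile: $@";
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = $slurp ? undef : "\n";
    my $hits = $matcher->($fh);
    close $fh;
    return $hits;
}

# Synthetic input (an assumption; the original test read *.c and *.h files).
my $text = join "", map { "line $_ abcdefghijklmnopqrstuvwxyz0007 tail\n" } 1 .. 2000;

for my $slurp (0, 1) {
    for my $use_study (0, 1) {
        my $t0   = (times)[0];    # user CPU seconds so far
        my $hits = run_search($text, $use_study, $slurp, 250);
        printf "study=%d slurp=%d hits=%d user=%.2fs\n",
            $use_study, $slurp, $hits, (times)[0] - $t0;
    }
}
```

Hit counts differ between the two read modes because a slurped file is a single record, so each pattern can match at most once per file. On perl 5.16 and later, study is documented as a no-op, so no timing difference should be expected there.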
A typical command line is (in the perl5.000 directory)
time perl4.036 try.study *.c *.h
Here are the results:
Perl        Study?   Slurp?   user time (sec)
__________________________________________________________
perl4.036   No       No       344
perl4.036   Yes      No       220
perl5.000   No       No       680
perl5.000   Yes      No       675
__________________________________________________________
perl4.036   No       Yes       25
perl4.036   Yes      Yes        8
perl5.000   No       Yes       26
perl5.000   Yes      Yes       26
Conclusion:
These differences in performance on a basic pattern extraction problem are
a bit surprising. It's especially puzzling that the study doesn't
seem to buy much for perl5.000.
Yes, I realize that slurping in paragraphs or whole files runs much faster
-- unless you run out of memory, in which case it won't run at all:-(. In
the interest of avoiding arbitrary limits, I usually use the default
line-at-a-time style -- that way some critical job won't bomb the night
before a presentation with "Out of memory!".
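The three read styles mentioned here (line at a time, paragraphs, whole files) are all selected through the input record separator $/. As a small illustration, here is a sketch that counts how many records a read loop sees under each setting. The helper name count_records and the sample text are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: count the records a read loop sees for a given $/.
sub count_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/ = $sep;
    my $n = 0;
    while (defined(my $record = <$fh>)) {
        $n++;
    }
    close $fh;
    return $n;
}

# Two paragraphs of two lines each, separated by one blank line.
my $text = "a\nb\n\nc\nd\n";

printf "line mode:      %d records\n", count_records($text, "\n");
printf "paragraph mode: %d records\n", count_records($text, "");
printf "slurp mode:     %d records\n", count_records($text, undef);
```

Fewer, larger records mean fewer trips through the 250 pattern tests, but each record has to fit in memory, which is the trade-off described above.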
Yes, I also realize that perl4.036 is optimized, while perl5.000 is
generally not. Still, I hope it's helpful to identify places where it's
worth the effort to optimize.
Andy Dougherty doughera@lafcol.lafayette.edu
Update: June 23, 2006
Perl                     Study?   user time (sec)
perl5.8.4-thread-multi   Yes      57
perl5.8.4-thread-multi   No       60
Update: May 2, 2011
time ./perl `awk '{print $1}' MANIFEST` > /dev/null
Perl             Study?   Slurp?   user time (sec)
__________________________________________________________
perl5.14.0-RC1   No       No       46.8
perl5.14.0-RC1   Yes      No       44.5
__________________________________________________________
perl5.14.0-RC1   No       Yes       3.23
perl5.14.0-RC1   Yes      Yes       0.66
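
To put the timings on a common scale, the relative saving from study can be worked out from the figures quoted in this message. The sketch below is plain arithmetic on those figures; the helper name pct_saved is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of time saved by study, relative to
# the run without it.
sub pct_saved {
    my ($without, $with) = @_;
    return 100 * ($without - $with) / $without;
}

# Timings in seconds, copied from this message: (without study, with study).
my @runs = (
    [ "metaconfig run",             58,   48   ],
    [ "perl5.14.0-RC1, line mode",  46.8, 44.5 ],
    [ "perl5.14.0-RC1, slurp mode", 3.23, 0.66 ],
);

for my $run (@runs) {
    my ($label, $without, $with) = @$run;
    printf "%-28s %5.1f%% saved, %.2fx\n",
        $label, pct_saved($without, $with), $without / $with;
}
```

By this arithmetic the line-at-a-time saving on perl5.14.0-RC1 is about 5%, in line with the figure quoted earlier in the thread, while the metaconfig run saves about 17% and the slurped run close to 80%.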