perl.perl5.porters | Postings from November 2011

Re: Perl Performance Project?

Nicholas Clark
November 17, 2011 05:54
On Wed, Nov 16, 2011 at 11:56:53PM -0800, Michael G Schwern wrote:
> On 2011.11.16 11:16 PM, H.Merijn Brand wrote:
> >> * Make a realistic benchmark suite of both performance and memory [4]
> >> * Set up a smoker to run the benchmarks and report significant differences
> >>   and performance creep to p5p, like with tests

I think that at least part of this is what Steffen Schwigon is trying to
do with Benchmark::Perl::Formance.

> > I know we're slow on this, but the new setup of Test::Smoke will store
> > all core test run times in the database, so one can select runs for the
> > same machine and compare them over time.
> That's a good start.
> It's hard to tease useful information out of that as the test time is
> monolithic.  It's difficult to know what caused a performance change... which
> test got slower?  Did perl get slower, or did the test change?  How do you
> usefully compare the performance of test runs between different versions of
> Perl when the tests are constantly changing?
> A benchmark suite has to be:
> * Fine grained
> * Repeatable
> * Deterministic (i.e. each set of runs produces the same result)
> * Comparable between commits
> * Applicable to real world performance situations
> It should be able to automatically answer the questions:
> * What slowed down / sped up?
> * When did it slow down / speed up?
> It should provide the tools to answer:
> * Why did it slow down?
> Unfortunately the test suite can't tell us that.
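The "fine grained / what slowed down" requirements above could look something like this in practice. This is only a sketch: the snippet names, the baseline numbers, and the 1.5x "slower" threshold are all invented for illustration, not taken from any existing tool.

```perl
#!/usr/bin/perl
# Sketch: time named snippets individually and compare each against a
# stored baseline, so a regression points at a specific operation.
# The baselines and the 1.5x threshold are hypothetical.
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical per-snippet baselines (seconds) from an earlier run.
my %baseline = ( sort_numeric => 0.02, regex_match => 0.01 );

my %bench = (
    sort_numeric => sub { my @s = sort { $a <=> $b } map { rand } 1 .. 10_000 },
    regex_match  => sub { my $n = () = ("abc " x 5_000) =~ /b/g },
);

for my $name (sort keys %bench) {
    my $t0 = [gettimeofday];
    $bench{$name}->();
    my $took  = tv_interval($t0);
    my $ratio = $took / $baseline{$name};
    printf "%-12s %.4fs (%.1fx baseline)%s\n",
        $name, $took, $ratio, $ratio > 1.5 ? "  <-- slower" : "";
}
```

Because each snippet is timed on its own, "which test got slower?" has a direct answer, which a monolithic test-suite wall-clock time cannot give.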

I agree totally. As I mailed the list about two months ago, in relation to
RT #98662:

Whilst it's potentially useful to know if the regression tests start
taking significantly different time, I'm still not convinced that they
make a good benchmark suite. They serve different purposes:

  regression tests
  * try to test obscure corner cases
  * focus on one thing in isolation
  * should run as quickly as possible, to avoid programmers getting bored
  * often end up being dominated by startup time

  benchmarks should
  * focus on common code
  * perform complex behaviour using multiple features
  * stress things with the scale of data needed to detect real problems
    [such as a change to O(bad) behaviour where previously it was O(acceptable)]
  * unless benchmarking startup time, strive to avoid it influencing the result

  and I don't think it's useful to try to make the *regression* tests pretend
  to be a benchmark.
  I've not looked at it yet, but I'm hoping that Steffen Schwigon's work
  on Benchmark::Perl::Formance is going to produce a more comprehensive
  benchmark than perlbench.

[I got his name wrong in the original. Corrected here. Gah.
I'd better add another drink to my budget for Erlangen.]

[Reading again, I think I might be conflicted about "fine grained". I can
see benefits of both fine-grained tests, and bigger tests that might catch
bad interactions between seemingly unrelated features.]
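The point above about stressing things at scale, to catch a change from O(acceptable) to O(bad), can be sketched with the core Benchmark module: run the same lookup two ways at more than one data size, so a complexity difference shows up as a rate gap that widens with size. The data sizes and snippet names here are arbitrary examples.

```perl
#!/usr/bin/perl
# Sketch: compare an O(n) linear scan against an O(1) hash lookup at
# two data sizes using the core Benchmark module. The sizes are
# arbitrary; the point is that the gap grows with scale.
use strict;
use warnings;
use Benchmark qw(cmpthese);

for my $size (1_000, 5_000) {
    my @data = map { "key$_" } 1 .. $size;
    my %index;
    @index{@data} = (1) x @data;
    my $needle = "key$size";    # worst case for the linear scan

    print "-- $size elements --\n";
    # Negative count: run each snippet for at least 0.2 CPU seconds.
    cmpthese(-0.2, {
        linear_scan => sub { my $hits = grep { $_ eq $needle } @data },
        hash_lookup => sub { my $hit  = exists $index{$needle} },
    });
}
```

At small sizes the two can look deceptively close, which is exactly why a benchmark run only on toy data can miss a real algorithmic regression.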

I'd love to see more work on this. I fear I can't offer much help other than
moral support and buying drinks at conferences as a "thank you".

Nicholas Clark