Re: All gone
From: Reini Urban
September 17, 2010 00:25
Message ID: AANLkTi=CtQWFojk5Ndn9F_VXNCg4eC6wZbhxwzshTj2R@mail.gmail.com
2010/9/16 Steffen Mueller <email@example.com>:
> Hi Andreas, hi Reini,
> (Andreas J. Koenig) wrote:
>>>>>>> On Mon, 13 Sep 2010 23:14:34 +0200, Reini Urban <firstname.lastname@example.org>
>> Executive summary: Do not dismiss findings just because you luckily
>> cannot reproduce them. Reini is right, his benchmark shows quite a
>> drastic slowdown and is reproducible.
>> Since Steffen had doubts about the methodology I measured again,
> NB: I had doubts about the methodology of the original benchmark as I have
> doubts about the methodology of virtually every benchmark.
>> combining Reini's one-liner with Steffen's dumbbench, and except for taking
>> longer the results are practically the same.
> The win from taking longer is that it provides you with an estimate of the
> variability of the results. This is extremely important when trying to
> interpret the results.
> This is more or less the same thing as running a benchmark multiple times
> manually to try to get an idea of how stable the results are. Except the
> script tries to do that quantitatively and the experimenter who runs things
> multiple times does a qualitative assessment which depends a lot on his
> experience. And most of the time, people don't even show that they went
> through this exercise. Thus my motivation to automate this important
> validation process.
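An automated run of that kind could be driven roughly along the lines of
the Dumbbench SYNOPSIS; the instance name, the command being timed and the
precision settings below are illustrative, not taken from this thread:

  use Dumbbench;

  my $bench = Dumbbench->new(
      target_rel_precision => 0.005,  # aim for roughly 0.5% precision
      initial_runs         => 20,     # more runs give a better variability estimate
  );
  $bench->add_instances(
      Dumbbench::Instance::Cmd->new(
          name    => 'blead',
          command => [ './perl', '-Ilib', 'bench.pl' ],  # hypothetical test script
      ),
  );
  $bench->run;
  $bench->report;   # prints the timing estimate together with its uncertainty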
>> Pasting the results into the table I sent on Monday. "time" denotes the
>> time of Reini's test which I ran 5 times and took the minimum. SMtime is
>> run with Steffen's dumbbench.
>> The one revision that made perl roughly 10% slower on Reini's test was
>> the malloc patch that was speeding up malloc on Windows by 100 times or
>> so (v5.13.3-207-gf120055).
> A day after I posted my reply, I re-ran the benchmark as well and I could
> reproduce the slowdown. Since I didn't understand the differing result at
> the time, I planned to take a closer look at night and... didn't get to it.
> Now, I'm sorry I didn't reply immediately saying so.
> Since then, I spent some time studying the underlying distribution of
> timings in small benchmarks and it's quite horrifying from a statistics
> point of view. While I consider this topic highly interesting, I suspect
> most on this list don't, so I won't go into details. I'll try to come up
> with a more reliable tool for estimating the variability of benchmark
> results, but it's not clear whether that's going to be possible.
> Getting back on topic: Even with Reini's benchmark showing a slowdown, it's
> not at all clear whether, in actual, real production code, there would be a
> slowdown overall or even an improvement in run-time.
> was possible to produce micro-benchmarks showing that the change doesn't
> adversely affect them. Now we have one that shows that there are occasions
> in which the patch does have a net negative effect. We will have to
> investigate more to arrive at a reasonable conclusion. Ideally with more
> realistic code samples.
I was testing some basic core features in a long Ackermann recursion
- which gives the longest run-time in the shortest code - mixed with
some typical perl features, excluding IO, which would be too noisy.
With the long run-time of 10 seconds I can exclude jitter, CPU effects
or other simple disturbances; we don't have to do statistics and the
comparison is rather fair. Even with a warm run before, it would lead
to the same result.
And btw. perl is not typically used with warm runs; perl has to be
fast on cold starts too.
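For illustration only - this is not the original one-liner - a benchmark
in that spirit might look like the following, run under time(1) or
dumbbench against each perl build being compared:

  # ack.pl - illustrative stand-in for the kind of benchmark described
  # above, NOT the original test: Ackermann gives a long run-time from
  # very little code, with some string/array work mixed in.
  use strict;
  use warnings;
  no warnings 'recursion';   # Ackermann recurses far deeper than 100 frames

  sub ack {
      my ($m, $n) = @_;
      return $n + 1            if $m == 0;
      return ack($m - 1, 1)    if $n == 0;
      return ack($m - 1, ack($m, $n - 1));
  }

  my @results;
  for my $n (1 .. 8) {
      push @results, sprintf "ack(3,%d) = %d", $n, ack(3, $n);
  }
  print join("\n", @results), "\n";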
Since we know now - thanks Andy, your machine is a real goodie -
that the problem is the realloc patch - which I objected to based on
my feelings, as I could not (and still don't) understand the benchmarks
which led to this commit - I could add some more array/string
functionality to the test.
Now I only see pad_alloc getting slower (a few lexicals being added),
but not the more general case of strings or arrays getting bigger or smaller,
which would lead to a more dramatic slowdown - just guessing.
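A quick way to poke at that guess - purely illustrative, not a
measurement posted in this thread - would be a loop that repeatedly
grows and shrinks a scalar and an array:

  # grow.pl - hypothetical stress test for SV/AV growth, to check
  # whether the new grow/realloc behaviour shows up outside pad_alloc.
  use strict;
  use warnings;

  for my $round (1 .. 1_000) {
      my $s = '';
      $s .= 'x' x 17 for 1 .. 5_000;   # grow a string in odd-sized steps
      my @a;
      push @a, $_ for 1 .. 10_000;     # grow an array element by element
      @a = ();                         # and shrink it again
  }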
But I would rather see the new grow function #ifdef'd for win32 only,
as I proposed when I first saw it.
Back to benching:
A better one-liner would need a better distribution of ~80% of all ops
without too many external dependencies, but getting this kind of coverage
is a tough task for a real hacker.
We don't even have the run-time pp_ call stats and pp_ coverage for it now.
I just calculated the compile-time stats once.
Testing the run-time of t/* is a good measure of that, I guess.
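As a rough illustration of how such compile-time op stats can be
collected - a sketch, not the script actually used for the numbers
mentioned above - one can count op names in B::Concise output:

  # count_ops.pl - approximate compile-time op histogram for a script.
  # Relies on the usual B::Concise -exec line format, e.g. "7 <2> add[t3] sK/2";
  # run-time pp_ call counts would need something else entirely.
  use strict;
  use warnings;

  my $script = shift or die "usage: $0 script.pl\n";

  open my $fh, '-|', $^X, '-MO=Concise,-exec', $script
      or die "cannot run B::Concise: $!";

  my %count;
  while (<$fh>) {
      next unless /^\s*\S+\s+<.>\s*([a-z0-9_]+)/;
      $count{$1}++;
  }
  close $fh;

  printf "%6d  %s\n", $count{$_}, $_
      for sort { $count{$b} <=> $count{$a} } keys %count;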