develooper Front page | perl.perl5.porters | Postings from January 2003

benchmarking - it's now all(-1,0,1,5,6)% faster

Nicholas Clark
January 11, 2003 11:13
This is intentionally a crosspost. One of parrot's aims is to go faster than
perl 5. Meanwhile, I've been trying to make perl 5 go faster. To achieve
either goal, we need to measure "faster". I'm having problems measuring
"faster". Well, I'm stuck. And unless we have a good plan for how to
measure representative average speeds of perl 5 and parrot on representative
tasks, I can't see how we can tune parrot to be faster than perl 5, or perl
5.10 to be faster than 5.8.

The story so far:

I'm trying a couple of things, involving inlining one of perl 5's functions,
and applying copy-on-write to the results captured by regexps, to see if they
make perl go faster. For want of anything better, I'm using the "perlbench"
suite to measure the speed of my patched versions. perlbench contains 20
small programs that do representative things, and times each of them. You
give it several versions of perl to benchmark, and it tells you the relative
timings. So far so good.

I was getting about 5% speedups on penfold against vanilla development perl.
Penfold is an x86 box (actually a Cyrix chip, which may be important) running
Debian unstable, with gcc 3.2.1 and 256M of RAM.

I tried the same tests on mirth, a ppc box, again Debian unstable, gcc 3.2.1,
but 128M of RAM. This time I saw 1% slowdowns.

So I tried the same tests on colon, PIII, 1G of RAM, but FreeBSD and gcc 2.95.
There I see 0% or 1% speedup.

But I see some very strange lines. For example on colon (reformatted):

                        A    L    B    K    C    J    D    H    E    G    F    I
                      ---  ---  ---  ---  ---  ---  ---  ---  ---  ---  ---  ---

array/sort-num        100  100  106  106   51   51  109  109  104  105   52   52

A,L are vanilla patchlevel 18142, B,K have the same patching, C,J have the
same patching as each other, etc.
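
(For anyone unfamiliar with perlbench's output: as I read these numbers, each
score is a normalized rate with the reference perl pinned at 100, so 106 means
roughly 6% faster and 51 means roughly half speed. A trivial sketch of that
normalization, on the assumption that scores scale as reference time over
measured time; the function name is mine, not perlbench's:

    #include <stdio.h>

    /* Assumed normalization: score = 100 * t_reference / t_measured,
     * so a binary taking twice as long as the reference scores 50. */
    static double score(double t_reference, double t_measured) {
        return 100.0 * t_reference / t_measured;
    }

    int main(void) {
        printf("%.0f\n", score(1.0, 1.0));   /* baseline: prints 100 */
        printf("%.0f\n", score(1.0, 2.0));   /* twice as slow: prints 50 */
        return 0;
    }

)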

The difference between E,G and F,I is only 1 thing; for E,G I have this macro:

#define RX_MATCH_COPY_FREE(rx) \
	STMT_START {if (RX_MATCH_COPIED(rx)) { \
	    Safefree(rx->subbeg); \
	    RX_MATCH_COPIED_off(rx); \
	}} STMT_END

whereas for F,I it does slightly more:

#define RX_MATCH_COPY_FREE(rx) \
	STMT_START {if (rx->saved_copy) { \
	    SV_CHECK_THINKFIRST_COW_DROP(rx->saved_copy); \
	} \
	if (RX_MATCH_COPIED(rx)) { \
	    Safefree(rx->subbeg); \
	    RX_MATCH_COPIED_off(rx); \
	}} STMT_END
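
(An aside for readers outside the core: STMT_START and STMT_END are perl's
portable spelling of the do { ... } while (0) wrapper, which keeps a
multi-statement macro behaving as a single statement under an unbraced if.
A minimal, self-contained C sketch of the pitfall the wrapper avoids; the
names here are illustrative, not from perl's source:

    #include <stdio.h>

    struct match { int freed; int copied; };

    /* Unsafe: expands to two statements, so under an unbraced `if`
     * only the first statement is guarded by the condition. */
    #define FREE_MATCH_BAD(m)  (m)->freed = 1; (m)->copied = 0;

    /* Safe: do { } while (0) (what STMT_START/STMT_END amount to)
     * makes the whole body a single statement. */
    #define FREE_MATCH_OK(m)   do { (m)->freed = 1; (m)->copied = 0; } while (0)

    static struct match demo_bad(void) {
        struct match m = {0, 1};
        if (0)
            FREE_MATCH_BAD(&m);   /* `(m)->copied = 0;` runs unconditionally! */
        return m;
    }

    static struct match demo_ok(void) {
        struct match m = {0, 1};
        if (0)
            FREE_MATCH_OK(&m);    /* nothing runs, as intended */
        return m;
    }

    int main(void) {
        printf("bad: copied=%d\n", demo_bad().copied);  /* prints copied=0 */
        printf("ok:  copied=%d\n", demo_ok().copied);   /* prints copied=1 */
        return 0;
    }

)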

(C, J have considerably more differences than the above, so it's not
easy to describe what they do. The other binaries also have changes to
code and differing compiler flags, so it's not easy to summarise.)

RX_MATCH_COPY_FREE is used exactly 4 times, where regexps need to free saved
matches, and only inside the regexp engine.

(If you're confused: it's a macro I created for the existing lines

    if (RX_MATCH_COPIED(rx)) {
	Safefree(rx->subbeg);
	RX_MATCH_COPIED_off(rx);
    }

that freed a copied match.)

Yet somehow that simple change makes the sort-num test run at half speed.

The entire sort_num.t file is this:


# Name: Array sorting
# Require: 4
# Desc:

require '';

@a = (1..200);
push(@b, splice(@a, rand(@a), 1)) while @a;  # shuffle

&runtest(0.3, <<'ENDTEST');

    @a = sort {$a <=> $b } @b;

ENDTEST

The part that actually loops is the one line @a = sort {$a <=> $b } @b;

It goes nowhere near any regexp code or regexp ops. The library it requires
contains no regexps, but does load Time::HiRes if it can, which in turn will
bring in Exporter and DynaLoader (which do have regexps). But there are no
regexps in the timing loop.

I see similar wild fluctuations in the index tests, which also have no regexps:

require '';

$a = "xx" x 100;
$b = "foobar";
$c = "xxx";

&runtest(15, <<'ENDTEST');

   $c = index($a, $b);
   $c = index($a, $c);

   $c = index($a, $b);
   $c = index($a, $c);

ENDTEST

So I'm confused. It looks like some bits of perl are incredibly sensitive to
cache alignment, or something similar. And as a consequence, perlbench is
reliably reporting wildly varying timings because of this, and because it
only tries a few, very specific things. Does this mean that it's still useful?
I'm not convinced that the real, average performance of a perl binary varies
this wildly. And if real performance doesn't vary this much, but perlbench
does vary this much, what sort of tasks should be used for quantitative
benchmarking of perl and parrot code? Because I fear that if we don't have
benchmarks to aim to improve on, we're shooting in the dark when it comes to
improving performance.
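
One way I can think of to tame this sort of noise (a sketch of general
practice, not of anything perlbench currently does): take many samples of the
same timing loop and compare minima or medians rather than single runs, since
cache and alignment effects mostly add time rather than remove it. A
self-contained C illustration; the workload and the sample count of 31 are
arbitrary:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Arbitrary stand-in for one benchmark iteration. */
    static volatile long sink;
    static void workload(void) {
        long s = 0;
        for (long i = 0; i < 100000; i++)
            s += i % 7;
        sink = s;
    }

    static double time_once(void) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        workload();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    static int cmp_double(const void *a, const void *b) {
        double x = *(const double *)a, y = *(const double *)b;
        return (x > y) - (x < y);
    }

    /* Take n timing samples and sort them ascending. */
    static void sample_sorted(double *out, int n) {
        for (int i = 0; i < n; i++)
            out[i] = time_once();
        qsort(out, (size_t)n, sizeof out[0], cmp_double);
    }

    int main(void) {
        enum { N = 31 };
        double s[N];
        sample_sorted(s, N);
        /* The minimum and median are far more stable run-to-run than
         * any single sample; noise mostly adds time, rarely removes it. */
        printf("min    %.6fs\n", s[0]);
        printf("median %.6fs\n", s[N / 2]);
        return 0;
    }

Comparing two perls by the minimum (or median) of many samples like this at
least separates repeatable differences from one-off cache accidents.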

Nicholas Clark
