develooper Front page | perl.perl5.porters | Postings from August 2010

Re: Patch to make string-append on win32 100 times faster

From: Wolfram Humann
Date: August 16, 2010 15:39
Subject: Re: Patch to make string-append on win32 100 times faster
Message ID: 4C69BE30.4090803@arcor.de

On 16.08.2010 21:17, Jan Dubois wrote:
> To provide similar numbers for Linux and GNU malloc I've compiled bleadperl
> just before and after the patch in question, both with usemymalloc=y
> and usemymalloc=n.  The results are just for the
> "1E7 chars + 1E5 x 1E1 chars" benchmark, as that is the slowest of the
> bunch.  I've run the benchmark script 100 times for each Perl build and
> show the min/max runtimes to show that there is quite a bit of noise:
>
> Before, GNU malloc:  Min=38.6 Max=45.4 Avg=41.20
> Before, Perl malloc: Min= 7.9 Max=14.2 Avg=11.14
>
> After, GNU malloc:   Min= 9.7 Max=12.7 Avg=11.45
> After, Perl malloc:  Min= 9.4 Max=13.2 Avg=11.16
>
> It shows that GNU malloc on its own takes 4 times as long as GNU malloc
> with the patch.  GNU malloc with this patch matches the time used by the
> Perl malloc (usemymalloc=y), which doesn't seem to be affected by the patch.

Hm, the differences I get on my Linux box go in the same direction but 
are less impressive. On an unpatched perl I get around 12 ms with 
usemymalloc=n and around 8 ms with usemymalloc=y.

But I also tried to follow up on Peter J. Holzer's remarks in 
<http://groups.google.com/group/comp.lang.perl.misc/msg/8747b64794aec27b>. 
He used strace for a closer look at what GNU malloc does and found that 
it leaves free space after allocated memory so that it can often realloc 
without moving memory. This free space (as far as I understand from 
watching private memory usage reported by Linux::Smaps) is not 
considered part of perl's memory until actually used. He also suggests 
that this algorithm -- like most "clever" ones -- can be broken. Looks 
like it can:
(All on an unpatched perl 5.10.1 with usemymalloc=n and usemymalloc=y, 
starting with an empty "my @ar2;". The code given is an additional line 
in my array of tests, but could also run stand-alone.)

1) 'append $i' => sub{ for my $i (1..1E3){ $ar2[$i] .= $c1E2 for 1..1E3 } }
That's the 'good' case: Append 1000 times to one string, move to the 
next string, append 1000 times to that one, move to the next...
GNU malloc:  238.8 ms
Perl malloc: 164.8 ms

2) 'append $_' => sub{ for my $i (1..1E3){ $ar2[$_] .= $c1E2 for 1..1E3 } }
That's the 'bad' case: append to each string once, then again to each 
string,...
GNU malloc:  57143.0 ms
Perl malloc:   298.0 ms
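
For anyone who wants to try the two access patterns without my test 
array, here is a rough stand-alone sketch. It is not the script I used for 
the numbers above. It assumes that $c1E2 is simply a 100-character string 
(called $chunk here); the sub names and the scaled-down default of 
200 x 200 are made up for illustration. Pass "1000 1000" on the command 
line to get the sizes from tests 1) and 2).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Assumption: the appended piece is a plain 100-character string.
my $chunk = 'x' x 100;

# Number of strings and number of appends per string.
# Defaults are deliberately small; the tests in the post use 1E3 and 1E3.
my ($n_strings, $n_appends) = (@ARGV == 2) ? @ARGV : (200, 200);

# 'Good' case: do all appends to one string, then move to the next one.
sub append_sequential {
    my ($strings, $appends) = @_;
    my @ar;
    for my $i (1 .. $strings) {
        $ar[$i] .= $chunk for 1 .. $appends;
    }
    return \@ar;
}

# 'Bad' case: append once to every string, then start over, so all
# strings grow at the same time.
sub append_interleaved {
    my ($strings, $appends) = @_;
    my @ar;
    for my $pass (1 .. $appends) {
        $ar[$_] .= $chunk for 1 .. $strings;
    }
    return \@ar;
}

# Wall-clock time of one call, in milliseconds.
sub time_ms {
    my ($code, @args) = @_;
    my $t0 = time();
    $code->(@args);
    return (time() - $t0) * 1000;
}

printf "append \$i (sequential):  %10.1f ms\n",
    time_ms(\&append_sequential, $n_strings, $n_appends);
printf "append \$_ (interleaved): %10.1f ms\n",
    time_ms(\&append_interleaved, $n_strings, $n_appends);
```

Both subs build exactly the same strings; only the order of the appends 
differs, so any difference in runtime should come from how the allocator 
copes with many strings growing at once.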

Jan, if possible, could you run that on your before-and-after-the-patch 
compiles?

Wolfram





