
Re: Patch to make string-append on win32 100 times faster

From: Wolfram Humann
Date: August 17, 2010 11:18
Subject: Re: Patch to make string-append on win32 100 times faster
Message ID: 4C6AD270.3080103@arcor.de

On 17.08.2010 00:47, Jan Dubois wrote:
> On Mon, 16 Aug 2010, Wolfram Humann wrote:
>    
>> (All on an unpatched perl 5.10.1 with usemymalloc=n and usemymalloc=y.
>> Starting with an empty my @ar2;. Code given is an additional line in my
>> array of tests, but could also run stand-alone)
>>
>> 1) 'append $i' =>  sub{ for my $i (1..1E3){ $ar2[$i] .= $c1E2 for 1..1E3 } }
>> That's the 'good' case: Append 1000 times to one string, move to the
>> next string, append 1000 times to that one, move to the next...
>> GNU malloc:    238.8 ms
>> Perl malloc: 164.8 ms
>>
>> 2) 'append $_' =>  sub{ for my $i (1..1E3){ $ar2[$_] .= $c1E2 for 1..1E3 } }
>> That's the 'bad' case: append to each string once, then again to each
>> string,...
>> GNU malloc:  57143.0 ms
>> Perl malloc: 298.0 ms
>>
>> Jan, if possible, could you run that on your before-and-after-the-patch
>> compiles?
>>      
> Before, GNU alloc:
> 	append $i:   220.4 ms
> 	append $_: 12879.6 ms
>
> Before, Perl alloc:
> 	append $i:  200.4 ms
> 	append $_:  297.6 ms
>
> After, GNU alloc:
> 	append $i:  190.9 ms
> 	append $_:  282.8 ms
>
> After, Perl alloc:
> 	append $i:  198.2 ms
> 	append $_:  269.7 ms
>
> So I guess as expected: GNU malloc improves a lot, Perl malloc is unaffected.
>
> I haven't looked at total memory consumption for these tests, so I can't
> confirm that the memory usage for Perl malloc() is unchanged.  I strongly
> suspect that to be the case though.
>    
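The two quoted cases can be reproduced stand-alone with a sketch like the
one below. It rests on a few assumptions that are not in the thread: $c1E2
is taken to be a 100-character string, the sub names are made up for
illustration, and the loop count is cut from 1E3 to 200 so that the 'bad'
case finishes quickly even with a slow realloc. Timings from it are not
comparable with the numbers quoted above.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $c1E2 = 'x' x 100;   # assumption: a 100-character chunk
my $n    = 200;         # reduced from 1E3 to keep the run short

# 'append $i': grow one string to its full size, then move to the next.
sub append_row_by_row {
    my ($count, $chunk) = @_;
    my @ar;
    for my $i (1 .. $count) { $ar[$i] .= $chunk for 1 .. $count }
    return \@ar;
}

# 'append $_': append once to every string, then once more to every string, ...
sub append_interleaved {
    my ($count, $chunk) = @_;
    my @ar;
    for my $i (1 .. $count) { $ar[$_] .= $chunk for 1 .. $count }
    return \@ar;
}

for my $case ([ 'append $i' => \&append_row_by_row ],
              [ 'append $_' => \&append_interleaved ]) {
    my ($name, $code) = @$case;
    my $t0 = time;
    $code->($n, $c1E2);
    printf "%-10s %8.1f ms\n", $name, (time - $t0) * 1000;
}
```

Both subs end up with identical strings; only the order of the appends
differs, which is what makes the allocator behaviour visible.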

I tried to look at memory consumption using Linux::Smaps, but my simple 
test case (in an endless loop: append 1024000 chars to a string, then 
report memory usage) shows far less memory overhead than I expected. 
That holds even when my patch or usemymalloc is active (when I looked at 
memory use on win32, usemymalloc always resulted in an even bigger 
memory footprint than my patch). I have no experience with Linux::Smaps. 
Is it the wrong tool for this purpose? Any ideas what to use instead?

Command line:
perl -MLinux::Smaps -MConfig -E'$u=usemymalloc; say "$u: ",$Config{$u}; 
$m=Linux::Smaps->new; while(1){$s .= "#" for 1..1024000; $m->update; 
printf"$_:%6d, ",$m->$_ for qw(rss shared_clean shared_dirty 
private_clean private_dirty); printf"string:%6d k\n",length($s)/1024} '

First 10 results for each of the four cases:

Before, GNU alloc:
usemymalloc: n
rss:  4052, shared_clean:  1600, shared_dirty:     0, private_clean:     4, private_dirty:  2448, string:  1000 k
rss:  5056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  3448, string:  2000 k
rss:  6056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  4448, string:  3000 k
rss:  7056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  5448, string:  4000 k
rss:  8056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  6448, string:  5000 k
rss:  9056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  7448, string:  6000 k
rss: 10056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  8448, string:  7000 k
rss: 11056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty:  9448, string:  8000 k
rss: 12056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty: 10448, string:  9000 k
rss: 13056, shared_clean:  1600, shared_dirty:     0, private_clean:     8, private_dirty: 11448, string: 10000 k

Before, Perl alloc:
usemymalloc: y
rss:  3992, shared_clean:   576, shared_dirty:     0, private_clean:  1036, private_dirty:  2380, string:  1000 k
rss:  4996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  3380, string:  2000 k
rss:  5996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  4380, string:  3000 k
rss:  6996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  5380, string:  4000 k
rss:  7996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  6380, string:  5000 k
rss:  8996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  7380, string:  6000 k
rss:  9996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  8380, string:  7000 k
rss: 10996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  9380, string:  8000 k
rss: 11996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty: 10380, string:  9000 k
rss: 12996, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty: 11380, string: 10000 k

After, GNU alloc:
usemymalloc: n
rss:  4068, shared_clean:   584, shared_dirty:     0, private_clean:  1024, private_dirty:  2460, string:  1000 k
rss:  5072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  3460, string:  2000 k
rss:  6072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  4460, string:  3000 k
rss:  7072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  5460, string:  4000 k
rss:  8072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  6460, string:  5000 k
rss:  9072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  7460, string:  6000 k
rss: 10072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  8460, string:  7000 k
rss: 11072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty:  9460, string:  8000 k
rss: 12072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty: 10460, string:  9000 k
rss: 13072, shared_clean:   584, shared_dirty:     0, private_clean:  1028, private_dirty: 11460, string: 10000 k

After, Perl alloc:
usemymalloc: y
rss:  4000, shared_clean:   576, shared_dirty:     0, private_clean:  1036, private_dirty:  2388, string:  1000 k
rss:  6036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  4420, string:  2000 k
rss:  7036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  5420, string:  3000 k
rss:  8036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  6420, string:  4000 k
rss:  9036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  7420, string:  5000 k
rss: 10036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  8420, string:  6000 k
rss: 11036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty:  9420, string:  7000 k
rss: 12036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty: 10420, string:  8000 k
rss: 13036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty: 11420, string:  9000 k
rss: 14036, shared_clean:   576, shared_dirty:     0, private_clean:  1040, private_dirty: 12420, string: 10000 k
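
One possible cross-check that does not depend on Linux::Smaps is to read
VmSize and VmRSS from /proc/self/status. The sketch below is only that: the
helper names parse_status_kb and read_status_kb are hypothetical, it is
Linux-only, and it stops after five rounds instead of looping forever. The
idea behind it is that rss counts only pages that have actually been
touched, so memory that was requested in advance but not yet written to
would be expected to show up in VmSize rather than in the rss column.

```perl
use strict;
use warnings;

# Parse the text of /proc/<pid>/status and return the Vm* fields in kB.
sub parse_status_kb {
    my ($text) = @_;
    my %kb;
    while ($text =~ /^(Vm\w+):\s+(\d+)\s+kB$/mg) {
        $kb{$1} = $2;
    }
    return \%kb;
}

# Read the status of the current process; empty hash if /proc is unavailable.
sub read_status_kb {
    open my $fh, '<', '/proc/self/status' or return +{};
    local $/;                       # slurp the whole file
    my $text = <$fh>;
    close $fh;
    return parse_status_kb($text);
}

my $s = '';
for my $round (1 .. 5) {
    $s .= '#' for 1 .. 1_024_000;   # same append pattern as the one-liner
    my $kb = read_status_kb();
    printf "VmSize:%8s, VmRSS:%8s, string:%6d k\n",
        $kb->{VmSize} // 'n/a', $kb->{VmRSS} // 'n/a', length($s) / 1024;
}
```

If VmSize grows noticeably faster than VmRSS between rounds, that would
point to over-allocation that the rss-based figures do not capture.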







