perl.perl5.porters | Postings from August 2010

RE: Patch to make string-append on win32 100 times faster

From: Jan Dubois
Date: August 15, 2010 23:19
Subject: RE: Patch to make string-append on win32 100 times faster
Message ID: 00d201cb3d0a$f9f34c70$edd9e550$@activestate.com

On Sun, 15 Aug 2010, Reini Urban wrote:
> Jan Dubois schrieb:
> > On Fri, 30 Jul 2010, Wolfram Humann wrote:
> > The discussion of this change seemed to have stalled, but I see
> > +1 votes from David Golden and Marvin Humphrey, with additional
> > information from Ben Morrow that the patch also helps on FreeBSD
> > (when building without -Dusemymalloc), with nobody voicing
> > any disagreement about applying the patch.
> 
> This particular slowdown was only recognized for WIN32 native malloc,
> but not for other platforms with better malloc libs.

Did you read the paragraph you quoted above?  It explicitly claims that
the slowdown happens on other platforms when using the platform native
malloc.

> Those platforms are now hurt by using overallocation,
> i.e. need more memory, e.g. with piping.

Could you provide some evidence for this claim?  The only way a
"better malloc" can prevent this slowdown is by doing some kind
of overallocation itself.  Since the algorithm in this patch
is not necessarily cumulative with the overallocation by malloc(),
it is quite possible that the change has no effect at all
on systems with an overallocating malloc().
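
To illustrate why some form of overallocation is needed to avoid the
slowdown, here is a minimal sketch that counts realloc() calls while
appending fixed-size chunks.  It assumes a hypothetical allocator that
hands back exactly the number of bytes requested; count_reallocs() and
its parameters are made up for this illustration and are not part of
Perl.  Whether each call also copies the buffer depends on the real
allocator.

```c
#include <stddef.h>

/* Count the realloc() calls needed to append `chunk` bytes `n` times.
 * Assumption: the allocator returns exactly the requested size, so the
 * buffer capacity always equals the last request.
 * overallocate == 0: request exactly the new length.
 * overallocate != 0: request at least cap + cap/4 + 10 bytes. */
size_t count_reallocs(size_t chunk, size_t n, int overallocate)
{
    size_t len = 0;    /* bytes in use */
    size_t cap = 0;    /* bytes allocated */
    size_t calls = 0;  /* number of realloc() calls so far */

    for (size_t i = 0; i < n; i++) {
        size_t needed = len + chunk;
        if (needed > cap) {
            size_t request = needed;
            if (overallocate) {
                size_t minimum = cap + (cap >> 2) + 10;
                if (minimum > request)
                    request = minimum;
            }
            cap = request;
            calls++;
        }
        len = needed;
    }
    return calls;
}
```

Under these assumptions every exact-size append triggers a realloc(),
while the geometric minimum keeps the number of calls roughly
logarithmic in the final string length.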

For example, assume Perl is appending 100 bytes to a 1000-byte string.
The patch under discussion will make sure that sv_grow() will request
1260 bytes (1000 + (1000 >> 2) + 10) instead of just 1100 bytes from realloc().
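
The arithmetic above can be sketched as a small helper.  This is only
an illustration of the rule as described in this message, not the code
in Perl's sv.c; min_grow_request() and its parameter names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size to request from realloc() when a buffer of
 * old_len bytes has to hold new_len bytes.  The request is never smaller
 * than old_len + old_len/4 + 10. */
size_t min_grow_request(size_t old_len, size_t new_len)
{
    assert(new_len >= old_len);  /* growing, never shrinking */

    /* Parentheses matter: in C, + binds tighter than >>. */
    size_t minimum = old_len + (old_len >> 2) + 10;

    return new_len > minimum ? new_len : minimum;
}
```

With these definitions, min_grow_request(1000, 1100) gives 1260, and a
larger append such as min_grow_request(1000, 1300) simply gives 1300.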

Assume the original 1000 bytes were allocated in a 1024-byte slab in
the allocator, so the 1100 bytes wouldn't fit anymore, and realloc()
will now move this to e.g. a 1536-byte slab.  In that case it doesn't
make any difference that we now asked for 1260 bytes instead of 1100.
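
A minimal sketch of that slab argument, assuming a made-up allocator
whose size classes include 1024 and 1536 bytes.  The table and the
slab_size() function are hypothetical and do not describe any
particular malloc() implementation.

```c
#include <stddef.h>

/* Hypothetical size classes; real allocators use their own tables. */
static const size_t slab_classes[] = {
    256, 384, 512, 768, 1024, 1536, 2048, 3072, 4096
};

/* Return the smallest size class that can hold `request` bytes,
 * or 0 if the request is larger than every class in the table. */
size_t slab_size(size_t request)
{
    size_t n = sizeof slab_classes / sizeof slab_classes[0];

    for (size_t i = 0; i < n; i++) {
        if (slab_classes[i] >= request)
            return slab_classes[i];
    }
    return 0;
}
```

With this table, requests for 1100 and 1260 bytes both land in the
1536-byte class, so the larger request costs no extra memory.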

Also note that the minimum growth is based on the old buffer size,
so appending 300 bytes to the 1000-byte string will only request
1300 bytes, because 1300 is already larger than the 1260-byte minimum
realloc size.

So until proven otherwise I doubt that this patch has any noticeable
effect on a "good malloc()".  On a "medium malloc()" I would expect
it to improve performance somewhat, at a moderate additional memory
requirement.

> Why was this patch not applied with the appropriate
> #if defined(_WIN32) or what is used for MSVC and mingw?

Because then it wouldn't be applied to other platforms, like FreeBSD.

Note though that I added this remark for discussion about disabling it
under -Dusemymalloc:

| b) Should the new over-allocation also be used under -Dusemymalloc?  It
|    will provide less benefit there, but also come at a lower cost because
|    the newlen can still fall inside the already allocated bucket size
|    anyways.  I don't have any strong opinion if it should be disabled
|    in this situation or not.

As I stated above, I don't think it will make much of a difference either
way for the -Dusemymalloc case, but I would love to see some comprehensive
benchmarks.

Cheers,
-Jan
