
Re: Patch to make string-append on win32 100 times faster

From: demerphq
Date: August 21, 2010 02:19
Subject: Re: Patch to make string-append on win32 100 times faster
Message ID: AANLkTikuaW1BjF5n1KE7knBJR9aH7fWCvgA0n1Ko_BfF@mail.gmail.com

On 20 August 2010 17:05, demerphq <demerphq@gmail.com> wrote:
> On 20 August 2010 16:53, demerphq <demerphq@gmail.com> wrote:
>> On 16 August 2010 08:19, Jan Dubois <jand@activestate.com> wrote:
>>> On Sun, 15 Aug 2010, Reini Urban wrote:
>>>> Jan Dubois schrieb:
>>>> > On Fri, 30 Jul 2010, Wolfram Humann wrote:
>>>> > The discussion of this change seemed to have stalled, but I see
>>>> > +1 votes from David Golden and Marvin Humphrey, with additional
>>>> > information from Ben Morrow that the patch also helps on FreeBSD
>>>> > (when building without -Dusemymalloc), with nobody voicing
>>>> > any disagreement about applying the patch.
>>>>
>>>> This particular slowdown was only recognized for WIN32 native malloc,
>>>> but not for other platforms with better malloc libs.
>>>
>>> Did you read the paragraph you quoted above?  It explicitly claims that
>>> the slowdown happens on other platforms when using the platform native
>>> malloc.
>>>
>>>> Those platforms are now hurt by using overallocation,
>>>> i.e. need more memory, e.g. with piping.
>>>
>>> Could you provide some evidence for this claim?  The only way a
>>> "better malloc" can prevent this slowdown is by doing some kind
>>> of overallocation itself.
>>
>> This is not correct. Mallocs/reallocs that can merge blocks do not
>> have the performance penalty that this algorithm seeks to work around.
>> The problem here is that the Win32 realloc always copies, and thus
>> extending a block a character at a time becomes quadratic. A
>> realloc that merges blocks, and only copies where there is insufficient
>> contiguous free space, does not have this problem.
>
> I'll just note that I'm not arguing against this patch. Just that
> overallocation is not the only reason that a malloc might not be
> penalized by this change.
>
> One real-world benchmark that people might want to try would be to use
> a routine like this:
>
> sub make_tree {
>  my ($depth) = shift;
>  return int rand 100 unless $depth>0;
>  return [ make_tree($depth-1), make_tree($depth-1) ]
> }
>
> and then use the XS implementation of Data::Dumper to dump the results
> of make_tree(N) for various depths N.
>
> On win32 even modest N will result in the machine essentially hanging.
> On no other OS that I've tried it on is the slowdown as noticeable.
> This was traced to the use of realloc in SV_GROW(). This was the
> analysis that led to Nicholas' original patch.

Ben, Wolfram, any chance you can try benchmarking this with and
without the new patch?
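
For anyone who wants to run it, a minimal sketch of such a benchmark might
look like the following. The time_dump() helper, the choice of depths, and
the use of Time::HiRes are assumptions made for illustration and were not
part of the original analysis; Data::Dumper uses its XS implementation by
default when it is available, which the sketch requests explicitly via
$Data::Dumper::Useperl.

```perl
use strict;
use warnings;
use Data::Dumper;
use Time::HiRes qw(time);

# Build a complete binary tree of array refs with random integer leaves
# (same shape as the make_tree routine quoted above).
sub make_tree {
    my ($depth) = @_;
    return int rand 100 unless $depth > 0;
    return [ make_tree($depth - 1), make_tree($depth - 1) ];
}

# Hypothetical helper: dump a tree of the given depth once and return
# the elapsed wall-clock seconds and the size of the dumped string.
sub time_dump {
    my ($depth) = @_;
    my $tree = make_tree($depth);
    local $Data::Dumper::Useperl = 0;   # prefer the XS implementation
    my $start = time;
    my $out   = Dumper($tree);
    return (time - $start, length $out);
}

# The depths are arbitrary; raise them gradually on a slow realloc.
for my $depth (8, 10, 12, 14) {
    my ($elapsed, $bytes) = time_dump($depth);
    printf "depth %2d: %8d bytes dumped in %.3f s\n", $depth, $bytes, $elapsed;
}
```

Running the same script against a perl built with and without the patch,
and comparing how the times grow as the depth increases, should show
whether the string-append cost is the bottleneck.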

cheers,
Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
