On 20 August 2010 17:05, demerphq <demerphq@gmail.com> wrote:
> On 20 August 2010 16:53, demerphq <demerphq@gmail.com> wrote:
>> On 16 August 2010 08:19, Jan Dubois <jand@activestate.com> wrote:
>>> On Sun, 15 Aug 2010, Reini Urban wrote:
>>>> Jan Dubois schrieb:
>>>> > On Fri, 30 Jul 2010, Wolfram Humann wrote:
>>>> > The discussion of this change seemed to have stalled, but I see
>>>> > +1 votes from David Golden and Marvin Humphrey, with additional
>>>> > information from Ben Morrow that the patch also helps on FreeBSD
>>>> > (when building without -Dusemymalloc), with nobody voicing
>>>> > any disagreement about applying the patch.
>>>>
>>>> This particular slowdown was only recognized for WIN32 native malloc,
>>>> but not for other platforms with better malloc libs.
>>>
>>> Did you read the paragraph you quoted above? It explicitly claims that
>>> the slowdown happens on other platforms when using the platform native
>>> malloc.
>>>
>>>> Those platforms are now hurt by using overallocation,
>>>> i.e. need more memory, e.g. with piping.
>>>
>>> Could you provide some evidence for this claim? The only way a
>>> "better malloc" can prevent this slowdown is by doing some kind
>>> of overallocation itself.
>>
>> This is not correct. Mallocs/reallocs that can merge blocks do not
>> have the performance penalty that this algorithm seeks to work around.
>> The problem here is that the Win32 realloc always copies, and thus
>> extending a block a character at a time becomes quadratic. A realloc
>> that merges blocks and only copies where there are insufficient
>> contiguous blocks does not have this problem.
>
> I'll just note that I'm not arguing against this patch. Just that
> overallocation is not the only reason that a malloc might not be
> penalized by this change.
>
> One real-world benchmark that people might want to try would be to use
> a routine like this:
>
> sub make_tree {
>     my ($depth) = shift;
>     return int rand 100 unless $depth>0;
>     return [ make_tree($depth-1), make_tree($depth-1) ]
> }
>
> and then use the XS implementation of Data::Dumper to dump the results
> of make_tree(N) for various depths N.
>
> On win32 even modest N will result in the machine essentially hanging.
> On no other OS that I've tried it on is the slowdown as noticeable.
> This was traced to the use of realloc in SV_GROW(). This was the
> analysis that led to Nicholas' original patch.

Ben, Wolfram, any chance you can try benchmarking this with and without
the new patch?

cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
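
For anyone wanting to run the comparison, here is a minimal sketch of
how the make_tree() benchmark could be timed. It is not taken from the
patch or from any existing test. The time_dump() helper, the use of
Time::HiRes, and the list of depths are assumptions made for
illustration. Setting $Data::Dumper::Useperl to 0 only asks for the XS
dumper, which is the default when the XS code is available.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Time::HiRes qw(time);

# Ask for the XS dumper explicitly (0 is already the default).
$Data::Dumper::Useperl = 0;

# Build a complete binary tree of array refs with random integer leaves.
sub make_tree {
    my ($depth) = @_;
    return int rand 100 unless $depth > 0;
    return [ make_tree($depth - 1), make_tree($depth - 1) ];
}

# Hypothetical helper: dump one tree of the given depth and return
# (elapsed seconds, length of the dumped string).
sub time_dump {
    my ($depth) = @_;
    my $tree  = make_tree($depth);
    my $start = time();
    my $out   = Dumper($tree);
    return (time() - $start, length $out);
}

# The depths are arbitrary picks; increase them with care on Win32.
for my $depth (8, 10, 12, 14, 16) {
    my ($secs, $len) = time_dump($depth);
    printf "depth %2d: %8d bytes dumped in %.3f s\n", $depth, $len, $secs;
}
```

Running the same script against a perl built with and without the
patch, and comparing how the time grows with the output size, should
show whether string growth during the dump is still dominated by
realloc copying.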