On Fri, 20 Aug 2010, demerphq wrote: > On 20 August 2010 16:53, demerphq <demerphq@gmail.com> wrote: >> On 16 August 2010 08:19, Jan Dubois <jand@activestate.com> wrote: >>> Could you provide some evidence for this claim? The only way a >>> "better malloc" can prevent this slowdown is by doing some kind of >>> overallocation itself. >> >> This is not correct. Mallocs/reallocs that can merge blocks do not >> have the performance penalty that this algorithm seeks to work >> around. The problem here is that the Win32 realloc always copies, and >> thus extending a block a character at a time becomes exponential. >> With a realloc that merges blocks and only copies where there is >> insufficient contiguous blocks does not have this problem. > > Ill just note that im not arguing against this patch. Just that > overallocation is not the only reason that a malloc might not be > penalized by this change. Indeed, "the only way" was a rather poor choice of words. Note though that all the block merge algorithms still depends heavily on the allocation pattern. If there are no free blocks to merge, then they still provide poor performance, and that case *will* be helped by the overallocation during the growth phase. This has been shown for both FreeBSD and the Linux allocators. > One real-world benchmark that people might want to try would be to use > a routine like this: > > sub make_tree { > my ($depth) = shift; > return int rand 100 unless $depth>0; > return [ make_tree($depth-1), make_tree($depth-1) ] > } > > and then use the XS implementation of Data::Dumper to dump the results > of make_tree() for various N. > > On win32 even modest N will result in the machine essentially hanging. > On no other OS that I've tried it on is the slowdown as noticeable. > This was traced to the use of realloc in SV_GROW(). This was the > analysis that lead to Nicholas' original patch. I don't see how sv_grow() comes into this example at all. Are you talking about av_expand()? That is indeed an area that should probably be scrutinized for potential Windows performance improvements too. I also have some vague memory that hash expansion beyond something like 100,000 keys gets extremely slow on Windows. Cheers, -JanThread Previous | Thread Next