develooper Front page | perl.perl5.porters | Postings from August 2010

RE: Patch to make string-append on win32 100 times faster

Thread Previous | Thread Next
Jan Dubois
August 13, 2010 17:39
RE: Patch to make string-append on win32 100 times faster
Message ID:
On Fri, 30 Jul 2010, Wolfram Humann wrote:

> I would like to propose that perl requests an additional amount of
> memory from realloc when strings grow by a small amount. After
> discussion on c.l.p.m. and with the help of ideas from Ilya Zakharevich
> and Peter J. Holzer my proposal reads like this:
> (I'm not a regular user of diff and using it wrong might do more harm
> than good)
> In sv.c the following section in Perl_sv_grow():
>      if (newlen > SvLEN(sv)) {        /* need more room? */
> #ifndef Perl_safesysmalloc_size
>      newlen = PERL_STRLEN_ROUNDUP(newlen);
> #endif
> should be expanded to:
> #endif
>      if (newlen > SvLEN(sv)) {        /* need more room? */
>      size_t minlen  = SvCUR(sv);
>      minlen += (minlen >> PERL_STRLEN_EXPAND_SHIFT) + 10;
>          if (newlen < minlen) newlen = minlen;
> /*        newlen += (newlen >> 1) + 10; */
> #ifndef Perl_safesysmalloc_size
>      newlen = PERL_STRLEN_ROUNDUP(newlen);
> #endif

Thank you for your patch!  I've now committed a slightly cleaned up version
to the Perl repository, and added your email address to the AUTHORS file:

Beyond formatting changes I did:

* change size_t to STRLEN for internal consistency
* removed the commented out code
* moved the PERL_STRLEN_EXPAND_SHIFT definition to perl.h

The discussion of this change seemed to have stalled, but I see
+1 votes from David Golden and Marvin Humphrey, with additional
information from Ben Morrow that the patch also helps on FreeBSD
(when building without -Dusemymalloc), with nobody voicing
any disagreement about applying the patch.

I've kept the original constants (PERL_STRLEN_EXPAND_SHIFT being
2, plus the minimum addition of 10).  While the shift overallocates
by 25%, this is based on the original length of the buffer, not
the requested new length. The reported 12.5% overallocation used
by Python (for arrays, not strings) is based on the new size already,
not the old.

The advantage of this algorithm is also that we don't overallocate
at all if the buffer already grows by more than 25%, so we won't
"waste" any space if 2 long strings are concatenated, only small
strings are repeatedly concatenated to an already long one.

I don't care much either way about the constant 10, but the price
for it is small: on a 32-bit system a null-string SV already
occupies 28 bytes (24 + 1*4), so once it grows to 32 bytes,
the "10" will actually expand it to 36 (24 + 3*4) bytes instead,
which doesn't matter either way.

Things that could benefit from further discussion are:

a) the algorithm in sv_grow() for expanding SVs with SvOOK

	if (newlen > SvLEN(sv))
	    newlen += 10 * (newlen - SvCUR(sv)); /* avoid copy each time */

   This doesn't make much sense, and is catastrophic when
   you append a very long string to a short one.  I have not looked
   into the heritage of this algorithm yet.

b) Should the new over-allocation also be used under -Dusemyalloc.  It
   will provide less benefit there, but also come at a lower cost because
   the newlen can still fall inside the already allocated bucket size
   anyways.  I don't have any strong opinion if it should be disabled
   in this situation or not.


Thread Previous | Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About