On Fri, Jul 30, 2010 at 10:30:27AM -0400, David Golden wrote: > On Fri, Jul 30, 2010 at 9:37 AM, Wolfram Humann <w.c.humann@arcor.de> wrote: > > I would like to propose that perl requests an additional amount of memory > > from realloc when strings grow by a small amount. > > +1 -- though I'm not clear whether this win32 specific or whether all > operating systems should work this way. IMO all should. > -- but I'd like to understand the logic behind certain choices. We discussed this kind of algorithm at length for Lucene and Lucy, including exploration into what Python and Ruby do. Dynamic array reallocation algorithms http://markmail.org/thread/x427sku4wrnc3rjf > * Why are you suggesting making the extra allocation proportional to > the length of the string? Why not a multiple of the length of the > increment? Why not a fixed amount? IMO, the overage ought to be proportional to the space required by the final string. > * Why the "+10"? Why have a extra minimum at all? Why "10" instead > of another number? Given your examples, for a large string the 10 is > a trivial fraction of the total and for a small string, the efficiency > either doesn't matter or the string quickly grows into a large string > anyway. That's similar to my analysis. I think it's more useful to round up to a multiple of the pointer size for small values. > * Why do you think PERL_STRLEN_EXPAND_SHIFT should be 2? That's a 25% > increase in the length of the string. (In many programs, I suspect > many strings are appended to only a few times and that seems like > large memory price to pay.) We decided to go with 12.5%, similar to Python. http://markmail.org/message/obehcimh2ly4cd3l I agree, a smaller size is better. Say you start with an array which hold 800 elements but which has been over-allocated by one eighth (+100 elements = 900). Reallocating at 900 and again at 1000-something isn't that different from reallocating only once at 1000. So long as you avoid reallocating at every increment -- 801, 802, etc -- you have achieved your main goal. Both mild and aggressive over-allocation solve that problem, but aggressive over-allocation also involves significant RAM costs. Where the best balance lies depends on how bad the reallocation performance is in relation to the cost of insertion. An eighth seems pretty good. Doubling seems too high. 1.5, dunno, seems like it could work. According to this superficially credible-seeming comment at stackoverflow, that's what Ruby does: http://stackoverflow.com/questions/1100311/what-is-the-ideal-growth-rate-for-a-dynamically-allocated-array/1100449#1100449 Incidentally, the Python algo was originally intended to address poor reallocation performance on Windows: http://markmail.org/message/hk5kpmgcelvizkif Marvin HumphreyThread Previous | Thread Next