
sv_grow() and malloc()

Dave Mitchell
September 23, 2014 11:50
(This is a spin-off from Reini's suggested work to tweak sv_grow().)

I've been thinking about all our heuristics for growing (and
over-growing) strings in sv_grow(), and how they're likely to interact
with the OS's malloc() library.

It occurred to me that malloc() libraries are likely to have some
reasonably sane and predictable behaviour (ha ha, I know, I know...).
For example, since malloc() guarantees alignment for any kind of variable,
it's likely to allocate blocks of at least 4/8 bytes on 32/64-bit systems;
it's therefore probably not efficient to malloc PVXs of less than 4/8
bytes (we'll just waste time later calling realloc(), which just returns
the same address).

Similarly, the malloc may in fact over-allocate: some quick tests with
the malloc on my 64-bit linux system imply that it initially allocates
24 bytes, even for a malloc(1).

I have two main thoughts. First, there are probably people on this list
who know much more about malloc behaviours than me, especially on obscure
platforms, so I'd welcome any input.

Second, it occurs to me that we could probably do some run-time probing
of the malloc() library to determine optimum initial malloc() and then
realloc() sizes.

For example the following simple C code:

    #include <stdio.h>
    #include <stdlib.h>   /* malloc(), realloc() */

    int main(int argc, char **argv)
    {
        int i;
        void *q, *p = malloc(1);
        malloc(1); /* poison reallocs */
        for (i=1; i<130; i++) {
            q = realloc(p,i);
            if (p != q) {
                /* block moved: the old one held at most i-1 bytes */
                printf("after %3d bytes realloc() using different address\n", i-1);
                malloc(i); /* poison reallocs */
            }
            p = q;
        }
        return 0;
    }

gives this output on my system:

    after  24 bytes realloc() using different address
    after  40 bytes realloc() using different address
    after  56 bytes realloc() using different address
    after  72 bytes realloc() using different address
    after  88 bytes realloc() using different address
    after 104 bytes realloc() using different address
    after 120 bytes realloc() using different address

which implies the malloc() library initially allocates 24 bytes, then
grows in 16-byte increments. A simple probe like the above at startup
time might reveal the optimum initial size for strings, and what
increment to round up to when reallocing.

I haven't researched this properly yet (I'm secretly hoping I won't have to
and someone already knows the answers).

No matter how many dust sheets you use, you will get paint on the carpet.
