Re: COW Revisted?
From: Dan Sugalski
April 28, 2002 11:33
Message ID: firstname.lastname@example.org
At 1:51 PM +0200 4/28/02, Peter Gibbs wrote:
>> The data which needs to be stored along with the buffer data can be
>> stored as either a header or a footer. The size of this header needs to be
>> a multiple of 16 (or possibly even 8) bytes, so that the real buffer
>> which follows would be correctly aligned. I'm not sure if this applies for
>> a footer.
>> A footer might allow us to tack 5-8 bytes on to the end of an
>> allocation, which might not always go 'over' the 16 byte rounding-up limit
>> we currently have in place. As long as we're not just-below the 16 byte
>> barrier on most of our allocations, this shouldn't waste any more memory.
>My current code (see
>one flag byte within the buffer itself, provided the buffer is always
>guaranteed to be large enough to hold a pointer, which the current 16-byte
>allocation scheme ensures. To implement substrings requires an additional
>pointer (or offset) in the string header.
>I was assuming that we only want to implement COW for strings, as non-string
>buffers are generally speaking not under our control.
Fair enough. A minimum allocation of one pointer plus a byte isn't
>I still think that the cheapest implementation for the flag byte is a
>footer, as we don't need to worry about alignment. This is actually a
>zero-cost option as far as memory allocation is concerned, as the current
>allocation scheme always allocates at least one extra byte.
>Variable-sized string headers are not really an option; if the overhead of
>an additional pointer is a problem, my first inclination would be to combine
>the chartype and encoding pointers into a single vtable entry.
It wasn't the extra pointer so much as the guaranteed extra 16 bytes
per allocation if we went with a header scheme. (Since we guarantee
16-byte alignment at the moment, we have to parcel out memory in 16
At this point I think we should do COW, and I think I know how to do
What we need is separate allocation routines for String and Buffer
data. String data doesn't need to be aligned, so we don't have to
bother with 16 byte chunks. Going with 2 or 4 byte chunks should be
sufficient for Strings. I'd been conflating string and non-string
general memory and, while that made for a smaller API, it also is
rather wasteful for string data, of which we'll have a lot.
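
To put rough numbers on the waste, here is a minimal sketch in C of the rounding arithmetic, assuming one trailing flag byte per string as in the footer idea quoted above. The names round_up and alloc_size are made up for illustration and are not existing Parrot functions.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of chunk.
 * Assumption: chunk is a power of two (2, 4, 16, ...). */
size_t round_up(size_t n, size_t chunk)
{
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
    return (n + chunk - 1) & ~(chunk - 1);
}

/* Bytes handed out for a string of len bytes plus one footer flag byte
 * (hypothetical layout: payload first, flag byte after it). */
size_t alloc_size(size_t len, size_t chunk)
{
    return round_up(len + 1, chunk);
}
```

Under these assumptions a 16 byte string costs alloc_size(16, 16) == 32 bytes with 16 byte chunks but alloc_size(16, 4) == 20 bytes with 4 byte chunks, while a 15 byte string costs 16 bytes either way.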
As part of that, I think we could also do with an overhaul of the
allocation system to better handle constants as well. We yank
constant data all over the place for no good reason. Constants are
immortal so if they're in their own pool there's no reason at all to
collect the things. (Well, not often--if we support module removal at
runtime we can potentially have constants go away, but that shouldn't
be at all common.)
So, let's do this:
1) We'll add allocate_string and reallocate_string functions, which
the strings use. It'll give us COW space at the end of the string
2) We'll add in new_*_const_header to match the new_*_header
functions, to allocate String/Buffer/PMC headers from constant header
arenas rather than from the default arenas
3) We'll add in (re)?allocate_const functions to allocate memory from
constant pools rather than the default collectable pools.
4) We'll add in COW functionality for strings and see what sort of
win we get. (I'm not sure that we'll win much with general COW, since
my gut feeling is that most COW strings will have a constant string
as their source, but I could be wrong here. That'd be OK)
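
Items 1 and 4 could be sketched roughly as below. This is only an illustration under stated assumptions, not the real Parrot code: SketchString, sketch_allocate_string, sketch_share and sketch_unshare are hypothetical names, malloc stands in for the string pool, the flag byte lives in a footer just past the usable space, and nothing is ever freed because reclaiming buffers is the collector's job. The flag is also never cleared, so the last holder of a once-shared buffer still copies before writing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string header; not Parrot's actual STRING struct. */
typedef struct SketchString {
    char  *buf;  /* start of the allocation */
    size_t len;  /* bytes in use */
    size_t cap;  /* usable bytes; the flag byte sits at buf[cap] */
} SketchString;

#define COW_SHARED 1

/* Address of the one-byte footer that follows the usable space. */
static unsigned char *cow_flag(const SketchString *s)
{
    return (unsigned char *)s->buf + s->cap;
}

/* Allocate room for len bytes plus the footer, rounded up to a
 * 4 byte chunk. Returns 0 on success, -1 on allocation failure. */
int sketch_allocate_string(SketchString *s, size_t len)
{
    size_t total = (len + 1 + 3) & ~(size_t)3;
    s->buf = malloc(total);
    if (s->buf == NULL)
        return -1;
    s->len = 0;
    s->cap = total - 1;
    *cow_flag(s) = 0;
    return 0;
}

/* Make dst a second header over src's buffer and mark it shared. */
void sketch_share(SketchString *dst, SketchString *src)
{
    *dst = *src;
    *cow_flag(src) = COW_SHARED;
}

/* Call before writing: if the buffer is shared, move s to a private copy. */
int sketch_unshare(SketchString *s)
{
    SketchString fresh;
    if (*cow_flag(s) != COW_SHARED)
        return 0;
    if (sketch_allocate_string(&fresh, s->len) != 0)
        return -1;
    memcpy(fresh.buf, s->buf, s->len);
    fresh.len = s->len;
    *s = fresh;
    return 0;
}
```

A caller would run sketch_unshare on a string before modifying it in place; reads never copy, so strings that are shared but never written cost only the extra header.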
This should decrease the amount of data we copy on GC runs, the
number of headers we trace for DOD runs, and generally tighten up our
You up for implementing this, Peter?
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
email@example.com have teddy bears and even
teddy bears get drunk