Re: [patch 0/3] rework sv.c body-inventory mechanics
From: Nicholas Clark
Date: February 1, 2010 06:43
Subject: Re: [patch 0/3] rework sv.c body-inventory mechanics
Message ID: 20100201144224.GA14952@plum.flirble.org
On Sun, Jan 31, 2010 at 11:23:01AM -0700, Jim Cromie wrote:
> the real issue is what new features it may support:
>
> - early reclaim of PTEs.
>
> When Storable::freeze() is used to create ~/.cpan/Metadata,
> ~2**19 (iirc) PTEs are created, then freed back to PL_bodyroots[x],
> but this free-list hangs out til process termination.
> Surely we have better things to do with that memory than
> to save it JUST IN CASE we need to freeze another dataset.
But that's not a normal use case.
Most people *aren't* running the CPAN client in production code.
And if code does something repeatedly, it's more likely to do it at about
the same size each time. (I'm thinking of loops, or multiple calls to the
same function.)
When I changed the pointer table code from calling malloc() to using
arenas:
http://perl5.git.perl.org/perl.git/commit/32e691d01937c3a1
it resulted (IIRC) in a 10% speed up in thread creation. I don't think that
I measured it, but I'm assuming that a fair chunk of it was due to avoiding
repeated malloc()/free(), which is what the arenas are there for.
> - segregated allocation
>
> Tim Bunce was recently asking for private-arenas to support NYTProf.
> My view is you dont really have a proper private-arenas facility
> until you can recycle them.
Strictly, Tim was looking for a way for NYTProf to avoid perturbing the
allocation behaviour of the program. In particular, the problem was SV heads
being re-used while something still holds a pointer to a head from its
old use.
So a simpler solution to *that* problem would be to find a way to switch to
a mode where no SV heads are reallocated. But that needs to be a reversible
switch, else the profiler would effectively cause perl to leak memory.
If NYTProf *really* wanted to avoid perturbing the interpreter's data
structures, it wouldn't use them. It would be written to use its own,
independent storage.
The code that you're proposing to change wouldn't help here, as the code
you're proposing to change only allocates SV bodies (and HEs and PTEs), not
SV heads.
> Conceivably, a routine populating a data structure could push
> fresh/empty freelists into the interpreter, and then
> all allocations made by the routine or its callees would be made from
> known arenas. In the extreme case,
> the sv-typed arenas could be sized such that ALL allocs
> done in the routine would fit in one arena each.
> This could be useful in somewhat esoteric situations.
Sure, but it could do that already *today*, by messing with the arena heads.
> - use less memory
> early PTE cleanup qualifies as one tool with benefits
> here.
The only things allocated by the arena code that might benefit from this are
PTEs. SV bodies and HEs are all hung off SV heads - there's no use in having
them tagged - the tag needs to go on the SV head (or its arena) instead.
The code for the PTEs' use of arenas is simpler than the code for the SV
bodies; the bugs and the fragility are in the latter, not the former. So
it would make more sense to split the PTE code from the rest of the bodies
before starting to modify it.
However, if one then starts to look at the *usage* patterns for PTEs, both
by ithreads and by Storable, one sees that they're actually quite different
from those of SV bodies and HEs. Bodies and HEs are freed and reallocated in
general use.
Whereas PTEs are *only* allocated by the pointer table code. In turn, the
pointer table code never *re*allocates PTEs - it's a monotonic allocation
pattern followed by complete freeing in ptr_table_clear().
In which case, changing the code from

    malloc() space for arenas
    walk them, to create a linked list
    create and use a ptr_table
        fast allocation from the arena
    free the ptr_table
        fast free to the arena
    ...
    create and use a ptr_table
        fast allocation from the arena
    free the ptr_table
        fast free to the arena

to

    create and use a ptr_table
        malloc() space for arenas
        walk them, to create a linked list
        fast allocation from the arena
    free the ptr_table
        fast free to the arena
        free() the arena
    ...
    create and use a ptr_table
        malloc() space for arenas
        walk them, to create a linked list
        fast allocation from the arena
    free the ptr_table
        fast free to the arena
        free() the arena

seems like it will often *create* more work than it avoids.
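
To make the first listing concrete, here is a rough sketch of a shared free-list arena. It is an illustration only, not the code in sv.c: the names fl_entry, ARENA_ENTRIES, arena_add, entry_alloc and entry_free are all made up, and unlike perl this toy version does not even record the arenas so that they could be freed at exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not perl's arena code. */
#define ARENA_ENTRIES 4   /* tiny, for illustration only */

union fl_entry {
    union fl_entry *next_free;                        /* valid while on the free list */
    struct { const void *oldval; void *newval; } pte; /* valid while in use */
};

static union fl_entry *free_list = NULL;

/* malloc() space for an arena, then walk it to thread every entry
 * onto the free list. The walk is the up-front cost paid per arena. */
static int arena_add(void)
{
    union fl_entry *arena = malloc(ARENA_ENTRIES * sizeof *arena);
    size_t i;
    if (!arena)
        return 0;
    for (i = 0; i < ARENA_ENTRIES; i++) {
        arena[i].next_free = free_list;
        free_list = &arena[i];
    }
    return 1;
}

/* Fast allocation: pop the head of the free list. */
static union fl_entry *entry_alloc(void)
{
    union fl_entry *e;
    if (!free_list && !arena_add())
        return NULL;
    e = free_list;
    free_list = e->next_free;
    return e;
}

/* Fast free: push the entry back. The memory stays with the
 * free list and is never handed back to malloc(). */
static void entry_free(union fl_entry *e)
{
    e->next_free = free_list;
    free_list = e;
}
```

The point to notice is that entry_free() only ever grows the free list, so a second table of about the same size costs no further malloc() calls, while the memory stays claimed between uses.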
This analysis makes me think that actually, yes, the PTE allocation code *is*
sub-optimal. But not in the way that you're proposing to fix it.
Instead, I think that the following would improve performance, *and* return
more memory to malloc()
* Remove PTEs from the common arena code/data structures
* Give each ptr_table its own arena chain
* Don't bother with creating the linked list - use two pointers to operate a
slab allocator for the arena
* Don't bother with deleting PTEs - just walk and free() the arena chain in
ptr_table_clear
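
A minimal sketch of what I mean by the four points above, under stated assumptions. Every name here (pte, pte_slab, pte_pool, SLAB_ENTRIES, pool_init, pool_alloc, pool_clear) is hypothetical and nothing is taken from the perl source; it only shows a per-table chain of slabs driven by two pointers, with no free list and no per-entry delete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; not perl's actual API. */
#define SLAB_ENTRIES 4   /* tiny, for illustration only */

struct pte {                      /* stand-in for a pointer table entry */
    const void *oldval;
    void       *newval;
};

struct pte_slab {
    struct pte_slab *prev;        /* chain of earlier slabs */
    struct pte       entries[SLAB_ENTRIES];
};

/* One of these would live inside each ptr_table. */
struct pte_pool {
    struct pte_slab *slabs;       /* head of this table's private chain */
    struct pte      *next;        /* next unused entry in the current slab */
    struct pte      *end;         /* one past the last entry in that slab */
};

static void pool_init(struct pte_pool *p)
{
    p->slabs = NULL;
    p->next = p->end = NULL;
}

/* Bump allocation: no linked list is built, just compare two pointers. */
static struct pte *pool_alloc(struct pte_pool *p)
{
    if (p->next == p->end) {
        struct pte_slab *s = malloc(sizeof *s);
        if (!s)
            return NULL;
        s->prev  = p->slabs;
        p->slabs = s;
        p->next  = s->entries;
        p->end   = s->entries + SLAB_ENTRIES;
    }
    return p->next++;
}

/* No per-entry free: walk the chain and free() each slab.
 * Returns the number of slabs released. */
static size_t pool_clear(struct pte_pool *p)
{
    size_t freed = 0;
    struct pte_slab *s = p->slabs;
    while (s) {
        struct pte_slab *prev = s->prev;
        free(s);
        s = prev;
        freed++;
    }
    pool_init(p);
    return freed;
}
```

This only works because of the monotonic allocation pattern described earlier: entries are never individually released, so there is nothing for a free list to track, and clearing the table returns whole slabs to malloc().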
Likely I would have had this half written in less time than it took to write
the e-mail, except that this would fall foul of the 5.12 feature freeze, and
I'm loath to write code which I know before starting can't be merged as soon
as it's finished.
Nicholas Clark