Re: Of PMCs Buffers and memory management
From: Mike Lambert
September 29, 2002 00:39
Message ID: Pine.LNX.email@example.com
> >>First and foremost, is there any compelling reason, to have totally
> >>different structures for PMCs and Buffers?
> >>- Both have a ->data aka ->bufstart
> >>- Both have ->flags, that have vastly the same meaning.
> > As jason said in another message, Dan has changed his mind from
> > yesteryear, and decided that buffers and pmcs should be the same
> > structure. There are a few ideas of my own that would be better
> > implemented if we unified the two,
> Are there any additional hints or pointers regarding this?
As far as things that could be done? There've been some discussions on the
mailing list before, but nothing really concrete. Read the thread starting
I have a semi-todo list of things I'd like to get done at some point
regarding the GC system. I can clean it up and post it if you like.
> > I could attempt a piecemeal conversion, submitting patches that get us a
> > bit closer, except that each patch would not be acceptable on its own due
> > to the confusion introduced. ie, having PMC use BUFFER_*_FLAGs,
> The internals during changes could be hidden with some #defines. So the
> surface would stay the same.
Yeah, but that seems even worse than the approach I mentioned above, only
because it requires yet another step to untangle and undo the defines
> > ... or having
> > worse memory usage/dod-speeds because of the larger size of buffers/pmcs
> > after they are unified, etc.
> I don't think a unified type has to be larger. I tried to lay out
> a data hierarchy, which basically should work, when other usages of e.g.
> "flags" and "buflen" in other structures or prototypes are first renamed
> (see attached test prog, native types used for brevity).
Well, currently hashtables require a surrounding PMC, which has a data
pointer to a Buffer. So unifying the two would allow these two structures
to be combined, with a smaller total footprint. But stuff like strings
or perlint pmcs doesn't use two headers, and so would actually be somewhat
larger if we unified the types.
Granted, if PMCs became buffers, the base buffer "class" wouldn't need
synchronize or cache, but it'd still need next_for_GC (yeah, we disagree
here, I'll argue that point below ;), and possibly room for a vtable. And
to do some of the things I hinted at before, we'd need a pointer back to
the header pool, which isn't kosher with Dan unless I'm able to
demonstrate a performance win.
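
To make the size trade-off concrete, here is a minimal sketch. The field names come from this thread, but both layouts are assumptions made for illustration, not the actual Parrot structures. It compares a plain buffer header with a hypothetical unified header that also carries next_for_GC and a vtable slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain buffer header (illustrative layout only). */
typedef struct PlainBuffer {
    void        *bufstart;
    size_t       buflen;
    unsigned int flags;
} PlainBuffer;

/* Hypothetical unified header: the buffer fields plus the two
 * pointers discussed above. Not the real Parrot layout. */
typedef struct UnifiedHeader {
    void                 *bufstart;     /* Buffer ->bufstart, aka PMC ->data */
    size_t                buflen;
    unsigned int          flags;
    struct UnifiedHeader *next_for_GC;  /* worklist link for marking */
    void                 *vtable;       /* unused by plain buffers */
} UnifiedHeader;

/* Extra bytes every plain buffer would pay after unification. */
size_t unified_overhead(void) {
    assert(sizeof(UnifiedHeader) >= sizeof(PlainBuffer));
    return sizeof(UnifiedHeader) - sizeof(PlainBuffer);
}
```

Under these assumed layouts, every string header grows by at least two pointers, while a hashtable that today needs a PMC plus a Buffer would need only one header.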
> [ recursive marking ]
> > So while it may seem more memory efficient to not use next_for_GC, it
> > actually isn't. A linked list of 500 elements would cause 500 recursive
> > calls and use more memory than would a next_for_GC solution.
> I'm not aware of such a deeply nested list. But as marking now knows of
> e.g. array of PMCs, it could mark a linked list of PMCs as well, w/o
> deep recursion.
Yes, arrays are a more efficient data structure than linked lists, and
arrays would largely avoid the problem of recursive marking blowing the
stack. However, are you going to impose restrictions on users of Perl6
code, telling them that they shouldn't be allowed to create linked lists?
If the programmer creates a linked list in Perl of 500 elements, it could
easily blow the stack. If I were programming and had my working test
program fail as I tried to extend it to realistic data sets, I would be
quite pissed. :)
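
The difference can be sketched as follows. This is a toy model, not Parrot's DOD code: the Node type, its child and live fields, and all function names are hypothetical, and only next_for_GC is a name taken from this thread. The recursive marker uses one C stack frame per list element, while the next_for_GC version threads a worklist through the headers themselves and runs at constant stack depth.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object header: each node refers to at most one other node. */
typedef struct Node {
    struct Node *child;        /* the object this node points to */
    struct Node *next_for_GC;  /* worklist link, used only while marking */
    int          live;
} Node;

/* Link storage[0..n-1] into a singly linked list and clear GC state. */
void make_list(Node *storage, size_t n) {
    size_t i;
    assert(storage != NULL);
    for (i = 0; i < n; i++) {
        storage[i].child = (i + 1 < n) ? &storage[i + 1] : NULL;
        storage[i].next_for_GC = NULL;
        storage[i].live = 0;
    }
}

/* Recursive marking: returns the recursion depth that was reached. */
size_t mark_recursive(Node *n, size_t depth) {
    if (n == NULL || n->live)
        return depth;
    n->live = 1;
    return mark_recursive(n->child, depth + 1);
}

/* Worklist marking: newly found objects are appended to a list
 * threaded through next_for_GC. Returns the number of objects marked. */
size_t mark_iterative(Node *root) {
    size_t marked = 0;
    Node *cur, *tail;
    if (root == NULL || root->live)
        return 0;
    root->live = 1;
    root->next_for_GC = NULL;
    tail = root;
    for (cur = root; cur != NULL; cur = cur->next_for_GC) {
        Node *c = cur->child;
        marked++;
        if (c != NULL && !c->live) {
            c->live = 1;
            c->next_for_GC = NULL;
            tail->next_for_GC = c;
            tail = c;
        }
    }
    return marked;
}
```

For a 500-element list, mark_recursive goes 500 calls deep, whereas mark_iterative visits the same 500 nodes in a single loop, paying one pointer per header instead of one stack frame per element.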
> >> SYNC *synchronize; /* undocumented + unused */
> > This is for multi-threaded access, where you need to synchronize on
> > something as a way to control access to the PMC. Of course, this is
> > entirely placeholder, as we don't have multi-threading or multiple
> > interpreters. :)
> Would an "in_use" bit not suffice?
This was pretty much placeholder, I think, as none of the logistics or
semantics have been defined. To do synchronization properly, you need an
OS which can give you the ability to create atomic locking, otherwise it
is impossible to be 100% correct. SYNC* would probably point to this OS
synchronized lock or somesuch. I'm only guessing, as I'm not intimately
familiar with multithreading implementations.
> >>What are: arena_base->extra_buffer_headers;
> > ... and
> > maybe we should force all headers to come from header pools.
> I think, we need just the sized pools, keeping things of same size
> together and one unsized pool. Both in two variants for vars/constants.
Since header pools are contiguous blocks of memory that are split up into
consecutive headers, it's pretty much impossible to have an unsized pool
of headers. It is possible, however, to have a pool of unsized-header
*pointers*, and that's exactly what extra_buffer_headers is.
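
A minimal sketch of the distinction, with hypothetical types and function names (HeaderPool, ExtraHeaders, and their helpers are not Parrot's API): a sized pool can find header i by arithmetic because every header has the same size, while headers of arbitrary size can only be tracked through an array of pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Sized pool: one contiguous arena cut into equal-sized headers. */
typedef struct HeaderPool {
    char  *arena;
    size_t header_size;
    size_t count;
} HeaderPool;

int pool_init(HeaderPool *p, size_t header_size, size_t count) {
    assert(p != NULL && header_size > 0);
    p->arena = calloc(count, header_size);
    p->header_size = header_size;
    p->count = count;
    return p->arena != NULL;
}

/* Header i lives at a fixed offset; this only works for a single size. */
void *pool_header_at(const HeaderPool *p, size_t i) {
    return i < p->count ? p->arena + i * p->header_size : NULL;
}

/* "Unsized" pool: a growable array of pointers to headers that were
 * allocated elsewhere and may each have a different size. */
typedef struct ExtraHeaders {
    void **items;
    size_t used;
    size_t cap;
} ExtraHeaders;

int extra_add(ExtraHeaders *e, void *header) {
    if (e->used == e->cap) {
        size_t ncap = e->cap ? e->cap * 2 : 8;
        void **grown = realloc(e->items, ncap * sizeof *grown);
        if (grown == NULL)
            return 0;
        e->items = grown;
        e->cap = ncap;
    }
    e->items[e->used++] = header;
    return 1;
}
```

The GC can walk a sized pool by stepping header_size bytes at a time, but it must follow each stored pointer to reach the extra headers.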
Currently, we group all headers of the same size in the same header pool,
although only constant string headers currently have their own pool,
mainly because we don't really have constants of anything else
implemented yet. :)
> > ... But there is
> > no compelling reason to do so, at this point in time. (I have some ideas
> > that would require it, tho)
> Could you elaborate on these ideas?
I guess I will need to write up those ideas. :)
> > ... I don't think we want interpreters appearing and
> > disappearing with references...they should be explicitly created and
> > destroyed.
> Actually, it's not a big difference, how they are destroyed, but we have
> already a "newinterp" opcode, so an interpreter PMC class just needs a
> custom destroy method - that gets called too ;-)
> Though, if nested structures inside the interpreter are all buffers,
> destroying them would neatly fit into the framework.
Yes, it would. But a lot of the interpreter's structures have data fields,
and those don't work too well as buffer data. They could work as part of a
sized buffer header, I suppose. I think it would be much easier to make
the interpreter PMC-ish, or at least have a PMC wrapper. Then this PMC
can have an active-destroy method, which would properly clean up
everything that needed to be cleaned up. Since the interpreter memory
would be malloc-allocated, it wouldn't be copied or cleaned on its own.
The PMC would become an interface for the GC system to control the
lifetime of the allocated interpreter memory, since the GC system would
control the PMC.
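
A rough sketch of that wrapper idea, under stated assumptions: Interp, WrapperPMC, and every function below are hypothetical stand-ins, not Parrot's types, and the sweep is a toy version of what a collector would do. The interpreter state is malloc-allocated and invisible to the GC; the GC only manages the small wrapper and calls its destroy hook when the wrapper is found dead.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for malloc-allocated interpreter state. */
typedef struct Interp {
    int *registers;
} Interp;

/* Stand-in for a PMC header with an active-destroy hook. */
typedef struct WrapperPMC {
    void  *data;                               /* points at the Interp */
    void (*destroy)(struct WrapperPMC *self);  /* called when the PMC dies */
    int    live;                               /* set by the mark phase */
} WrapperPMC;

/* Active destroy: free everything the interpreter owns, then the interpreter. */
void interp_destroy(WrapperPMC *self) {
    Interp *interp = self->data;
    if (interp != NULL) {
        free(interp->registers);
        free(interp);
    }
    self->data = NULL;
}

/* Allocate an interpreter and hang it off the wrapper. Returns 0 on failure. */
int wrap_new_interp(WrapperPMC *pmc) {
    Interp *interp;
    assert(pmc != NULL);
    interp = malloc(sizeof *interp);
    if (interp == NULL)
        return 0;
    interp->registers = calloc(32, sizeof *interp->registers);
    if (interp->registers == NULL) {
        free(interp);
        return 0;
    }
    pmc->data = interp;
    pmc->destroy = interp_destroy;
    pmc->live = 0;
    return 1;
}

/* Toy sweep: destroy every unmarked wrapper, then clear the marks
 * for the next cycle. Returns how many wrappers were destroyed. */
size_t sweep(WrapperPMC *pmcs, size_t n) {
    size_t destroyed = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        if (!pmcs[i].live && pmcs[i].data != NULL && pmcs[i].destroy != NULL) {
            pmcs[i].destroy(&pmcs[i]);
            destroyed++;
        }
        pmcs[i].live = 0;
    }
    return destroyed;
}
```

The interpreter's lifetime then follows the wrapper's reachability, and none of the interpreter's internal structures need to be buffers themselves.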