perl.perl6.internals
From: Dan Sugalski
July 14, 2004 09:03
Message ID: firstname.lastname@example.org
Okay, here's a really, really evil idea. (And yes, bluntly, it's
triggered by the pie-thon bytecode translator's needs.) I need a
stack, and one that's faster than our current stack which, while
snappy for what it does, is still burdened by generality. I also need
a stack that's generally not very big. So...
The fake stack. The idea is this. PMC registers 18-29 are used as a
12 element stack. PMC register 31 is used as a stack overflow array.
Integer register I31 is used as the stack depth register. We add in
a few new ops: a fake push, a fake pop, and two forms of iset.
The fake push op takes either a Px or an Ix argument. It pushes the
contents of Px onto the top of the stack, or pushes the stack down by
Ix entries. Note that the Ix form does *not* set these new slots to
anything! They're left as-is, which can be an issue for the GC.
The fake pop op pops Ix entries off the stack, default of one. Note
that there's no need for any destination--if you're popping the TOS
into a register then just do a set first.
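To make the push and pop behaviour concrete, here is a minimal Python sketch of the register window. It is a model only, not Parrot code: the class and method names are hypothetical, and it assumes P18 holds the top of the stack, that a push shifts P18..P29 down by one, and that the entry falling off P29 is appended to the overflow array in P31. Clearing the vacated bottom slot on pop is also a choice made here, not something the proposal specifies.

```python
# Hypothetical model of the fake stack. Class and method names are
# illustrative only; they are not Parrot ops or API.
STACK_TOP = 18      # P18, assumed here to hold the top of the stack
STACK_SLOTS = 12    # P18..P29
OVERFLOW = 31       # P31 holds the overflow array
DEPTH = 31          # I31 holds the stack depth


class FakeStack:
    def __init__(self):
        self.P = [None] * 32    # PMC registers
        self.I = [0] * 32       # integer registers
        self.P[OVERFLOW] = []

    def push_slots(self, count):
        """Push the stack down by count entries without initialising them."""
        bottom = STACK_TOP + STACK_SLOTS - 1
        for _ in range(count):
            if self.I[DEPTH] >= STACK_SLOTS:
                # The window is full of live entries, so spill the bottom one.
                self.P[OVERFLOW].append(self.P[bottom])
            # Shift P18..P28 into P19..P29; P18 keeps its stale contents.
            for reg in range(bottom, STACK_TOP, -1):
                self.P[reg] = self.P[reg - 1]
            self.I[DEPTH] += 1

    def push(self, value):
        """Push one value onto the top of the stack."""
        self.push_slots(1)
        self.P[STACK_TOP] = value

    def pop(self, count=1):
        """Drop count entries; there is no destination register."""
        bottom = STACK_TOP + STACK_SLOTS - 1
        for _ in range(count):
            if self.I[DEPTH] == 0:
                raise IndexError("fake stack is empty")
            # Shift P19..P29 up into P18..P28.
            for reg in range(STACK_TOP, bottom):
                self.P[reg] = self.P[reg + 1]
            # Refill the bottom slot from the overflow array if it has entries.
            spilled = self.P[OVERFLOW]
            self.P[bottom] = spilled.pop() if spilled else None
            self.I[DEPTH] -= 1


stack = FakeStack()
for n in range(1, 15):
    stack.push(n)
print(stack.P[STACK_TOP], stack.I[DEPTH], stack.P[OVERFLOW])  # 14 14 [1, 2]
```

With fourteen pushes the two oldest entries end up in the overflow array, and the twelve newest stay in P18..P29.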
    iset Px, Iy
This sets Px to the contents of the PMC register whose number is in Iy.
    iset Ix, Py
This sets the PMC register whose number is in Ix to Py.
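The two iset forms can be sketched in Python as follows. This is only an illustration of the indirection: the function names are hypothetical, and the register files are modelled as plain lists.

```python
# Hypothetical model of the two iset forms; P and I are the PMC and
# integer register files, modelled as plain Python lists.

def iset_load(P, I, x, y):
    """iset Px, Iy: copy the PMC register numbered I[y] into P[x]."""
    P[x] = P[I[y]]


def iset_store(P, I, x, y):
    """iset Ix, Py: copy P[y] into the PMC register numbered I[x]."""
    P[I[x]] = P[y]


P = ["pmc%d" % n for n in range(32)]
I = [0] * 32

I[5] = 18                # I5 names register P18
iset_load(P, I, 0, 5)    # P0 now holds what P18 holds
iset_store(P, I, 5, 1)   # P18 now holds what P1 holds
print(P[0], P[18])       # pmc18 pmc1
```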
The last two are pretty standard indirect register access. If someone
wants to propose a more generic syntax for it we might want to
generalize on, that's fine--do it *after* OSCON, thanks. :)
The reason for the odd register count and register number usage is
twofold. First, it leaves some of the top-half registers free for
other things. Second, the registers used will be 8-byte aligned and a
multiple of 8 bytes on systems with 4-byte pointers. Not likely a
huge deal, but it may shave a cycle off here or there, and it does
mean we have a few spare registers in the upper half.
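The alignment claim is easy to check with a little arithmetic. The sketch below assumes the PMC registers are a contiguous array of 4-byte pointers whose base address is itself 8-byte aligned; that layout is an assumption for illustration, not something stated above.

```python
# Arithmetic check, assuming a contiguous array of 4-byte PMC pointers
# starting at an 8-byte-aligned base address.
POINTER_SIZE = 4
FIRST_REG, LAST_REG = 18, 29

start_offset = FIRST_REG * POINTER_SIZE                  # 72 bytes in
block_size = (LAST_REG - FIRST_REG + 1) * POINTER_SIZE   # 48 bytes long

print(start_offset % 8, block_size % 8)  # 0 0
```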
Note that this stack does *not* need rot, swap, dup, or other funky
'move the things around' ops, since that can already be done. For
example:
dup2 (the top two entries are duplicated)
    exchange P18, P19
    exchange P18, P20
    exchange P19, P20
And it means that stack based ops turn to:
add (pop the top two entries, add them, and push the result):
    add P17, P18, P19
    set P18, P17
and something like add-in-place (new TOS = old TOS+1 entry + old TOS,
where TOS+1 is the entry just below the top):
    add P18, P19, P18
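The register traffic in those two sequences can be traced with a small Python model. The helper names are hypothetical, plain integers stand in for PMCs, and only the add and set lines shown above are modelled; the stack depth is not touched.

```python
# Hypothetical stand-ins for the three-argument add and the set op,
# acting on a list that models the PMC register file.

def op_add(P, dest, left, right):
    P[dest] = P[left] + P[right]


def op_set(P, dest, src):
    P[dest] = P[src]


P = [None] * 32

# add: the sum lands in the scratch register P17, then is copied to P18.
P[18], P[19] = 2, 5
op_add(P, 17, 18, 19)
op_set(P, 18, 17)
sum_via_scratch = P[18]

# add-in-place: P18 = P19 + P18, with no scratch register involved.
P[18], P[19] = 2, 5
op_add(P, 18, 19, 18)
sum_in_place = P[18]

print(sum_via_scratch, sum_in_place)  # 7 7
```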
It seems sensible, which of course worries me, as do so many things I
think are sensible, but I think we should do this--it'll be useful
for other languages that like being stack based. (If I'd any sense,
I'd have done this ages ago as it'd have made the Forth
implementation nicer.) We should get x86 JIT code for the new ops,
since I expect they'll be used rather a lot.
Full-fledged tracking of used stack slots within basic blocks with
register coloring and cross-block register exchanges would, of
course, make a lot of sense and be faster, but I'm a bit pressed for
time here, so we'll make do. Only the fake push and pop ops are at
all odd, so I'll put those in experimental.ops for now. The iset ops
will go into set.ops, though if we decide that we want a more general
indirect register access scheme we can see about renaming them. (We
probably should put in an opname aliasing feature to the pir and pasm
compilers, but we'll deal with that later, unless someone's feeling
like a project.)
--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
email@example.com                     have teddy bears and even
                                      teddy bears get drunk