Re: A quick sketch of the interpreter
From: Dan Sugalski
June 15, 2001 15:19
Message ID: firstname.lastname@example.org
At 01:33 PM 6/15/2001 -0700, Benjamin Stuhl wrote:
>--- Dan Sugalski <email@example.com> wrote:
> > =head1 Stacks
> > The stacks are at least:
> > =over 4
> > =item Temp stack
> > for squirreling away the contents of individual registers
> > =item Register stack
> > For pushing the entire register file at once. There are
> > four sets, one
> > for each register type.
> > =item state stack
> > For the interpreter's internal state
> > =back
>Perl 5-ish save stack for dynamic scoping? (whatever term
Yep, I think so, if we need to do that. I'm not sure we want to duplicate
the save stack--it's clever but kinda dodgy. Probably pile on the scope
cleanup operator or something, though I suppose that's functionally equivalent.
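For reference, the general idea behind a save stack for dynamic scoping can be sketched roughly as below. This is only an illustration under assumptions: the names (SaveStack, save_and_set, scope_enter, scope_leave) and the fixed size are made up for the example, and it is not Perl 5's actual save stack nor a settled design for the new interpreter.

```c
#include <assert.h>
#include <stddef.h>

#define SAVE_STACK_MAX 256   /* hypothetical fixed limit for the sketch */

/* Each entry remembers where a value lived and what it held before
   the current scope overwrote it. */
typedef struct {
    long *addr;       /* location that was overwritten */
    long  old_value;  /* value to put back at scope exit */
} SaveEntry;

typedef struct {
    SaveEntry entries[SAVE_STACK_MAX];
    size_t    top;    /* number of live entries */
} SaveStack;

/* Entering a scope: remember the current depth as a mark. */
static size_t scope_enter(const SaveStack *ss) { return ss->top; }

/* Dynamically scoped assignment: record the old value, then overwrite. */
static void save_and_set(SaveStack *ss, long *addr, long new_value) {
    assert(ss->top < SAVE_STACK_MAX);
    ss->entries[ss->top].addr = addr;
    ss->entries[ss->top].old_value = *addr;
    ss->top++;
    *addr = new_value;
}

/* Leaving a scope: undo everything recorded since the mark, newest first. */
static void scope_leave(SaveStack *ss, size_t mark) {
    assert(mark <= ss->top);
    while (ss->top > mark) {
        ss->top--;
        *ss->entries[ss->top].addr = ss->entries[ss->top].old_value;
    }
}
```

A scope cleanup operator would end up doing the same unwinding as scope_leave(), just triggered from the op stream rather than from a separate stack discipline.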
>What is the subroutine calling convention? Caller cleans or
Callee, I think. Most subs will probably unconditionally snag a new set of
> > =head1 Registers
> > We have four sets. Each set has 64 members
>Do we really need 64 ints and 64 floats? 64 stringish ones
>I can understand (sort of) - the RE engine could use them.
The RE engine will probably heavily abuse the int file as well, to track
positions and backtrack and other odd things like that. I'm not sure that
64 will be enough in some cases but, then, for some regexes we can't
possibly have enough.
>Maybe only 32 each of ints and floats?
Well, we get consistency and a small amount of simplicity that way. It also
means that we don't have to worry about making a messup in the compiler
somewhere that uses the wrong register count and tries for floating-point
register 44 by mistake. (A small thing to be sure, but asymmetry's a good
spot for slipups to occur)
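A minimal sketch of what the symmetric layout buys, assuming plain C arrays for the register sets. The type and function names (IntRegs, IntRegStack, int_regs_push, reg_index_ok) and the stack depth are hypothetical; only the count of 64 per set and the "push the whole set at once" behaviour come from the sketch quoted above.

```c
#include <assert.h>

#define NUM_REGS        64   /* same count for every register type */
#define REG_STACK_DEPTH 16   /* hypothetical limit for the sketch */

/* One register set; the other three types would look the same
   with a different element type. */
typedef struct {
    long reg[NUM_REGS];
} IntRegs;

/* Register stack for the integer set: each frame is a whole set. */
typedef struct {
    IntRegs frames[REG_STACK_DEPTH];
    int     depth;
} IntRegStack;

/* Push all 64 integer registers in one go (a plain struct copy). */
static void int_regs_push(IntRegStack *st, const IntRegs *cur) {
    assert(st->depth < REG_STACK_DEPTH);
    st->frames[st->depth++] = *cur;
}

/* Restore all 64 integer registers from the most recent frame. */
static void int_regs_pop(IntRegStack *st, IntRegs *cur) {
    assert(st->depth > 0);
    *cur = st->frames[--st->depth];
}

/* Because every set has the same size, one bounds check serves all
   four register types; the compiler needs no per-type limit. */
static int reg_index_ok(int n) {
    return n >= 0 && n < NUM_REGS;
}
```

With 32 floats and 64 ints, the check would need to know the register type; with 64 everywhere, register 44 is valid no matter which set it names.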
>Also, what about the
>suggestion to have the various special values
>(&PL_sv_undef, &PL_sv_yes, &PL_sv_no) be registers (so
>undef $foo becomes 'st sp_reg0, $foo' or somesuch)? Also,
>what about having one or more of the registers be the
>'lexical state register(s)' to implement pragmas (or is
>this the state stack?)?
I'm not sure if the special constants will be registers. I don't know that
there's enough of a benefit to be worth it--they'll probably end up as
guaranteed constants in the constants section or something.
We'll have special-purpose stuff, of course, like the opcode pointer and
such. I've not really been thinking of them as registers, though they
really are in one sense.
> > =head1 Opcodes
> > Opcodes are all dispatched indirectly via an opcode
> > function
> > table. Each segment of bytecode (a segment roughly
> > corresponding to a
> > compilation unit--a precompiled module would be in its
> > own segment,
> > for example) has its own opcode function table.
>Be wary of this. I tried this in Perl 5 (on an old sun4c,
>granted), and I came out with something like a 5% slowdown
>over having the function pointer actually stored in the op,
Yep, I know that number as I did the same test myself a few years back. The
reduced opcode size didn't make a difference, alas.
We're shooting for modular, precompiled, shareable bytecode modules with
the ability to define opcodes and potentially override opcode functions.
Unfortunately that means that if we don't go with an indirect function
table we lose a lot of that and have to preprocess the bytecode on loading
to both find all the pointers and then fix up the loaded bytecode.
That means no mmapping (sort of--certainly no shared segments) and it
forces us to load in all the bytecode in memory since we need to process
it. (We can possibly get lucky in mapping in large modules where we don't
actually load in some pages from disk--if we're mapping memory to the file
and we don't touch chunks of the bytecode because a module's got a lot of
functionality we don't use, we don't bother with the I/O to read it in)
It may turn out not to be a performance win, in which case we'll go for Plan B.
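To make the tradeoff concrete, here is a rough sketch of per-segment indirect dispatch, under the assumption that bytecode is an array of ints and each op function returns the next opcode pointer. All names (Segment, OpFunc, dispatch_one, op_inc and friends) are invented for the illustration. The point is that the bytecode stays untouched and read-only; only the table differs between segments.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Interp Interp;

/* An op function gets the current opcode pointer and returns the next. */
typedef const int *(*OpFunc)(Interp *, const int *pc);

struct Interp {
    long int_reg[64];
};

/* A bytecode segment carries its own opcode function table. */
typedef struct {
    const int    *code;      /* never patched, so it could be shared or mapped */
    size_t        code_len;
    const OpFunc *op_table;  /* indexed by opcode number */
    size_t        op_count;
} Segment;

/* Opcode 0: stop. */
static const int *op_end(Interp *in, const int *pc) {
    (void)in; (void)pc;
    return NULL;
}

/* Opcode 1: increment the integer register named by the operand. */
static const int *op_inc(Interp *in, const int *pc) {
    assert(pc[1] >= 0 && pc[1] < 64);
    in->int_reg[pc[1]]++;
    return pc + 2;
}

/* A hypothetical override of opcode 1 for some other segment. */
static const int *op_inc_by_two(Interp *in, const int *pc) {
    assert(pc[1] >= 0 && pc[1] < 64);
    in->int_reg[pc[1]] += 2;
    return pc + 2;
}

/* Indirect dispatch: look the function up in the segment's table. */
static const int *dispatch_one(Interp *in, const Segment *seg, const int *pc) {
    assert(*pc >= 0 && (size_t)*pc < seg->op_count);
    return seg->op_table[*pc](in, pc);
}
```

Storing the function pointer in the op itself would skip the table lookup in dispatch_one(), but the loader would then have to write a pointer into every op, which is what rules out sharing the mapped bytecode.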
> > =head1 The opcode loop
> > This is a tight loop. All it does is call an opcode
> > function, get back
> > a pointer to the next opcode to execute, and check the
> > event dispatch
> > flag. Lather, rinse, repeat ad infinitum.
>How does this port to a TIL form?
Badly. :) We'll need to insert event checking code into the generated TIL,
or figure out some way to wedge into the platform interrupt/async system.
(I'd bet on the former, though)
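For comparison, the non-TIL loop described in the quoted sketch might look something like the following. This is a sketch under assumptions, not the real thing: the Interp fields, handle_events(), and the three toy opcodes are hypothetical, and a real event flag would be set asynchronously rather than by an op.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Interp Interp;
typedef const int *(*OpFunc)(Interp *, const int *pc);

struct Interp {
    const OpFunc *op_table;
    volatile int  event_pending;   /* the event dispatch flag */
    int           events_handled;
    long          counter;
};

/* Opcode 0: stop the loop by returning no next opcode. */
static const int *op_end(Interp *in, const int *pc) {
    (void)in; (void)pc;
    return NULL;
}

/* Opcode 1: trivial work, then fall through to the next opcode. */
static const int *op_tick(Interp *in, const int *pc) {
    in->counter++;
    return pc + 1;
}

/* Opcode 2: stands in for an asynchronous event arriving. */
static const int *op_raise(Interp *in, const int *pc) {
    in->event_pending = 1;
    return pc + 1;
}

/* Hypothetical event handler: clear the flag and count the event. */
static void handle_events(Interp *in) {
    in->event_pending = 0;
    in->events_handled++;
}

/* The loop: call the op, take the returned pointer, check the flag. */
static void run_loop(Interp *in, const int *pc) {
    while (pc != NULL) {
        pc = in->op_table[*pc](in, pc);
        if (in->event_pending)
            handle_events(in);
    }
}
```

In a TIL form there is no run_loop() to hang the flag check on, so the equivalent of the `if (in->event_pending)` test would have to be emitted into the generated code itself.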
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
firstname.lastname@example.org        have teddy bears and even
                                      teddy bears get drunk