Re: Of mops and microops

Leopold Toetsch
October 28, 2002 23:36
Dan Sugalski wrote:

> At 7:09 PM +0100 10/27/02, Leopold Toetsch wrote:
>> So the I-register access is substituted by access to 3 global integers.
>> Now, how would these globals be loaded? When are these »arg« OPs 
>> inserted?
>> Currently the register optimizer in jit.c does something very similar: 
>> Setting up register access for the most used parrot registers in one 
>> execution block, plus load and store at block begin/end.
> This is definitely a Clever Thing, and one I've pondered on and off. It 
> will definitely speed up some things, as there's less bytecode to chew 
> through, there's more of a chance for optimization by the C compiler 
> when parrot's being built, and generally more opportunity to cheat.

It would also solve the multi_keyed problem. The _get_keyed argument 
preparation could fetch the PMC out of the aggregate.

> I'm currently leaning against it only because it doesn't ultimately help 
> the JIT. What we have now is wildly cool and damn useful (and has anyone 
> heard from Daniel lately, BTW?) but there's room for more optimizations.

Yes, that's correct. JIT wouldn't profit currently. But with an 
optimized stream of (micro-)ops, where the optimized fetch/store opcodes 
are placed at a finer granularity than whole basic blocks, JIT could 
profit too. Also, the JIT optimizer that now runs at load time would run 
at compile time instead, so JIT startup time would be cut down.

> Now, on the other hand it *does* speed up the interpreter, so it's 
> definitely not an idea to discard. But if we're going to (and I'd still 
> like to hold off) I think we're better off with a few special versions 
> of ops that target one or two registers directly, perhaps register 0 and 
> 1, rather than have a separate set of special-purpose registers.

My hack with the 3 globals obviously includes some cheating: globals are 
a no-no when there are multiple interpreters. Nevertheless, we could 
produce an optimized PBC stream where the 3*4 registers are treated as 
"fast" registers, with load/store to the 32*4 slower registers only when 
necessary. This would also fit neatly with my proposal WRT keyed access.
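
A minimal sketch in C of what such a stream could look like, for the 
integer registers only. Everything here (Interp, MicroOp, the OP_* names, 
run, demo_add) is hypothetical and not Parrot's actual layout or opcode 
set. The only points it tries to show are that the 3 fast slots live in 
the interpreter struct rather than in globals, and that the arithmetic op 
itself decodes no register operands:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not Parrot's real layout. */
enum { NUM_SLOW = 32, NUM_FAST = 3 };

typedef struct {
    int32_t slow[NUM_SLOW];  /* the ordinary I registers */
    int32_t fast[NUM_FAST];  /* per-interpreter "fast" slots, not globals */
} Interp;

typedef enum { OP_LOAD, OP_STORE, OP_ADD, OP_END } Opcode;

typedef struct {
    Opcode op;
    int fast_idx;  /* fast slot, used by OP_LOAD and OP_STORE only */
    int slow_idx;  /* slow register, used by OP_LOAD and OP_STORE only */
} MicroOp;

/* Guard against malformed streams in this sketch. */
static void check(const MicroOp *pc) {
    assert(pc->fast_idx >= 0 && pc->fast_idx < NUM_FAST);
    assert(pc->slow_idx >= 0 && pc->slow_idx < NUM_SLOW);
}

/* Run micro-ops until OP_END. OP_ADD always computes
 * fast[0] = fast[1] + fast[2], so it has no operands to decode. */
void run(Interp *in, const MicroOp *pc) {
    for (;; pc++) {
        switch (pc->op) {
        case OP_LOAD:
            check(pc);
            in->fast[pc->fast_idx] = in->slow[pc->slow_idx];
            break;
        case OP_STORE:
            check(pc);
            in->slow[pc->slow_idx] = in->fast[pc->fast_idx];
            break;
        case OP_ADD:
            in->fast[0] = in->fast[1] + in->fast[2];
            break;
        case OP_END:
            return;
        }
    }
}

/* "add I5, I6, I7" expanded into load/load/add/store micro-ops. */
int32_t demo_add(int32_t x, int32_t y) {
    Interp in = {{0}, {0}};
    const MicroOp prog[] = {
        {OP_LOAD,  1, 6},
        {OP_LOAD,  2, 7},
        {OP_ADD,   0, 0},
        {OP_STORE, 0, 5},
        {OP_END,   0, 0}
    };
    in.slow[6] = x;
    in.slow[7] = y;
    run(&in, prog);
    return in.slow[5];
}
```

In this naive expansion every add costs four micro-ops. The gain would 
only come from an optimizer dropping the loads and stores whose value is 
already in a fast slot, which is the "only when necessary" part above.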

I was also thinking of the various fixed-size integer ops for JVM or 
C#. The load/store ops would prepare integers of the needed size and do 
sign extension when necessary.
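
As a rough illustration of the sign extension part, here is a sketch in 
C. INTVAL is assumed to be 32 bits here, and the load_*/store_* function 
names are made up for this example. The loads widen a fixed-size value 
into a full register using arithmetic that avoids implementation-defined 
narrowing casts; the store in this sketch asserts that the value fits 
rather than silently truncating, which is just one possible choice:

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t INTVAL;  /* assumption: a 32-bit integer register */

/* Signed 8-bit load: flip the sign bit, then subtract its weight.
 * 0xFF -> 0x7F -> 127 - 128 = -1 */
INTVAL load_i8(uint8_t raw) {
    return (INTVAL)(raw ^ 0x80) - 0x80;
}

/* Unsigned 8-bit load: plain zero extension. */
INTVAL load_u8(uint8_t raw) {
    return (INTVAL)raw;
}

/* Signed 16-bit load, same trick with bit 15. */
INTVAL load_i16(uint16_t raw) {
    return (INTVAL)(raw ^ 0x8000) - 0x8000;
}

/* Signed 8-bit store: the register value must already fit in 8 bits.
 * Conversion to an unsigned type is well defined (modulo 256). */
uint8_t store_i8(INTVAL v) {
    assert(v >= INT8_MIN && v <= INT8_MAX);
    return (uint8_t)v;
}
```

With this, the ops that do the actual arithmetic only ever see full-width 
INTVALs, and the size handling stays in the load/store ops.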

