develooper Front page | perl.perl6.internals | Postings from August 2001

Re: Opcode Dispatch

Bryan C . Warnock
August 7, 2001 06:41
On Monday 06 August 2001 09:08 am, Bryan C. Warnock wrote:
> It could be that part of the "fixup" is to convert from bytes to wider
> ops, or something similar.  If that's the case, I can patch the code and
> rerun it.

Okay.  I rewrote the code from scratch.  (Rev 2 is always better anyway.)
Same machines as before. 

I followed Dan's recipe (for the most part).  The opcodes are now 32 bits 
wide, and each opcode takes 0, 1, or 2 arguments. 
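To make the layout concrete, here is a minimal sketch of such a stream: every word is 32 bits, and each opcode is followed directly by its 0, 1, or 2 argument words.  The arity table, the choice of opcode 0 as NO-OP, and the name count_instructions are assumptions for illustration only, not the test code itself.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t op_t;   /* every word in the stream is 32 bits wide */

/* Hypothetical argument counts: arity[op] is 0, 1, or 2.
 * Opcode 0 is assumed to be NO-OP, opcode 1 is program termination. */
static const uint8_t arity[] = { 0 /* NO-OP */, 0 /* END */, 1, 2 };

/* Walk a stream that ends with opcode 1 and count its instructions,
 * skipping over each opcode's argument words. */
size_t count_instructions(const op_t *pc)
{
    size_t n = 0;
    for (;;) {
        op_t op = *pc;
        n++;
        if (op == 1)
            return n;
        pc += 1 + arity[op];   /* opcode word plus its arguments */
    }
}
```

The only point here is that the dispatcher must know each opcode's argument count in order to find the next opcode.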

I tested with 512, 1024, 2048, and 8192 opcodes (all contiguous) in a single 
table.  I did not do any sort of context switching between multiple tables.

The 8192-* tests did not complete, and I've scrapped them.  (As you'll see, 
some of the tests were insane, and gcc was having fits attempting to 

The 2048-* tests did not complete on Solaris.  (The tests ran for about 
seven hours.)  I've reported the partial results, and you should be able to 
extrapolate the remainder.

I tested a full table lookup dispatch, a full switch dispatch, and a partial 
switch / the rest lookup dispatch.
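For readers who want the two basic shapes side by side, the following is a rough sketch of a full table lookup dispatch and a full switch dispatch over the same four toy opcodes.  The vm_t state, the op_* handlers, and their semantics are all hypothetical; only the dispatch structure is meant to reflect what was tested.

```c
#include <stdint.h>

typedef uint32_t op_t;
typedef struct { long acc; } vm_t;   /* hypothetical interpreter state */

/* Each handler returns the address of the next opcode, or 0 to stop. */
typedef const op_t *(*op_fn)(vm_t *, const op_t *);

static const op_t *op_noop(vm_t *vm, const op_t *pc) { (void)vm; return pc + 1; }
static const op_t *op_end (vm_t *vm, const op_t *pc) { (void)vm; (void)pc; return 0; }
static const op_t *op_add1(vm_t *vm, const op_t *pc) { vm->acc += (long)pc[1]; return pc + 2; }
static const op_t *op_add2(vm_t *vm, const op_t *pc)
{
    vm->acc += (long)pc[1] + (long)pc[2];
    return pc + 3;
}

/* Full table lookup: the opcode number indexes a function-pointer table. */
static const op_fn table[] = { op_noop, op_end, op_add1, op_add2 };

void run_lookup(vm_t *vm, const op_t *pc)
{
    while (pc)
        pc = table[*pc](vm, pc);
}

/* Full switch: one case per opcode, each calling its handler. */
void run_switch(vm_t *vm, const op_t *pc)
{
    while (pc) {
        switch (*pc) {
        case 0:  pc = op_noop(vm, pc); break;
        case 1:  pc = op_end(vm, pc);  break;
        case 2:  pc = op_add1(vm, pc); break;
        case 3:  pc = op_add2(vm, pc); break;
        default: pc = 0;               break;   /* unknown opcode: stop */
        }
    }
}
```

With hundreds or thousands of cases the switch body grows accordingly, which is presumably where the compiler starts to struggle.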

The full switch had both a normal and an inlined NO-OP opcode variant.

The partial switch would switch on 32, 128, or 256 opcodes (all contiguous), 
and had normal, inlined NO-OP, and fully inlined switch variants.
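A partial switch with the rest handled by lookup might look roughly like the sketch below.  Here only three opcodes are switched on (the tests used 32, 128, or 256), the NO-OP is inlined in its case, one case calls its handler normally, and everything else falls through to the table.  The handlers, vm_t, and the opcode numbering are assumptions for illustration.

```c
#include <stdint.h>

typedef uint32_t op_t;
typedef struct { long acc; } vm_t;   /* hypothetical interpreter state */
typedef const op_t *(*op_fn)(vm_t *, const op_t *);

static const op_t *op_noop(vm_t *vm, const op_t *pc) { (void)vm; return pc + 1; }
static const op_t *op_end (vm_t *vm, const op_t *pc) { (void)vm; (void)pc; return 0; }
static const op_t *op_add1(vm_t *vm, const op_t *pc) { vm->acc += (long)pc[1]; return pc + 2; }
static const op_t *op_add2(vm_t *vm, const op_t *pc)
{
    vm->acc += (long)pc[1] + (long)pc[2];
    return pc + 3;
}
static const op_t *op_sub1(vm_t *vm, const op_t *pc) { vm->acc -= (long)pc[1]; return pc + 2; }

/* The table still covers every opcode, so the default case can use it. */
static const op_fn table[] = { op_noop, op_end, op_add1, op_add2, op_sub1 };

void run_partial(vm_t *vm, const op_t *pc)
{
    while (pc) {
        switch (*pc) {
        case 0:  pc += 1;                  break;  /* NO-OP, inlined        */
        case 1:  pc = 0;                   break;  /* END, inlined          */
        case 2:  pc = op_add1(vm, pc);     break;  /* normal: call handler  */
        default: pc = table[*pc](vm, pc);  break;  /* the rest: lookup      */
        }
    }
}
```

A fully inlined variant would replace the op_add1 call in case 2 with the handler's body, the same way cases 0 and 1 are written here.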

Tests were run with both gcc's debugging '-g' and optimization '-O2' flags.
Unfortunately, I didn't time the actual compilation of each test.  Some of 
them were taking quite a while, and that, of course, should come into play.

Each data set consisted of 40,000 opcodes (randomly distributed between 
opcode 2 and opcode[-1]) and their arguments, appended with a single opcode 
1 (program termination).  The data was interspersed with 7% NO-OP opcodes.  
This "program" was looped through 5000 times.
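One way such a data set could be generated is sketched below.  Several details are guesses: that NO-OP is opcode 0, that each of the 40,000 slots independently has a 7% chance of being a NO-OP, that an opcode's argument count is op % 3, and that arguments are arbitrary random words.  make_program and arity_of are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t op_t;

enum { OP_NOOP = 0, OP_END = 1, N_INSTR = 40000 };

/* Hypothetical arity rule: 0, 1, or 2 arguments depending on the opcode. */
static unsigned arity_of(op_t op) { return op % 3; }

/* Build one "program" for a table of n_ops opcodes (n_ops > 2).
 * Returns a malloc'd buffer and stores its length in words in *len;
 * returns NULL if allocation fails.  The caller frees the buffer. */
op_t *make_program(unsigned n_ops, unsigned seed, size_t *len)
{
    /* Worst case: every instruction carries 2 arguments, plus the END. */
    op_t *buf = malloc((3 * (size_t)N_INSTR + 1) * sizeof *buf);
    size_t n = 0;
    int i;

    if (!buf)
        return NULL;
    srand(seed);
    for (i = 0; i < N_INSTR; i++) {
        op_t op;
        unsigned a;

        if (rand() % 100 < 7) {            /* assumed: ~7% NO-OPs */
            buf[n++] = OP_NOOP;
            continue;
        }
        /* Random opcode between opcode 2 and the last opcode. */
        op = 2 + (op_t)(rand() % (int)(n_ops - 2));
        buf[n++] = op;
        for (a = arity_of(op); a > 0; a--)
            buf[n++] = (op_t)rand();       /* argument word */
    }
    buf[n++] = OP_END;                     /* single terminating opcode 1 */
    *len = n;
    return buf;
}
```

A timing run would then execute the same buffer 5000 times under each dispatcher, restarting from the first word on each pass.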

A summary of results:

Full switches are right out, and will not be tested again.  They were the 
slowest of the constructs, and usually by a lot.

For Linux/x86, lookup was consistently faster with no optimizations.  With 
optimizations, lookup was the fastest with the smallest number of opcodes.  
As more and more opcodes were added, some of the inlined partial switches 
were just as efficient as a lookup.

For Solaris/Sparc, the inlined-variant partial switches were fastest with 
the smaller number of opcodes and case statements.  As the number of opcodes 
increased, lookup became slightly faster with optimized code, but 
consistently slower with the debug code.

The complete results can be found at

Bryan C. Warnock
