
Re: benchmarking - it's now all(-1,0,1,5,6)% faster

Leopold Toetsch
January 12, 2003 11:06
Nicholas Clark wrote:

> So I'm confused. It looks like some bits of perl are incredibly sensitive to
> cache alignment, or something similar.

This reminds me of my remarks on JITed mops.pasm, whose speed varied by
~50% (or more) depending on the position of the loop in memory. See near
the end of jit/i386/jit_emit.h.

And no, I still don't know what's going on.

(The story for perl5-porters + my comment:
  the loop is just 1 subtraction and a conditional jump. Inserting nops
before this loop has a drastic impact on performance. Below is the gdb
output of the loop.)

/* my i386/athlon has a drastic speed penalty for what?
  * not for unaligned odd jump targets
  * But:
  * mops.pbc 790 => 300-530  if code gets just 4 bytes bigger
  * (loop is at 200 instead of 196 ???)
  * FAST:
  * 0x818100a <jit_func+194>:    sub    %edi,%ebx
  * 0x818100c <jit_func+196>:    jne    0x818100a <jit_func+194>
  * Same fast speed w/o 2nd register
  * 0x8181102 <jit_func+186>:    sub    0x8164c2c,%ebx
  * 0x8181108 <jit_func+192>:    jne    0x8181102 <jit_func+186>
  * SLOW (same slow with register or odd aligned)
  * 0x818118a <jit_func+194>:    sub    0x8164cac,%ebx
  * 0x8181190 <jit_func+200>:    jne    0x818118a <jit_func+194>

> Nicholas Clark

