Nicholas Clark wrote:

> So I'm confused. It looks like some bits of perl are incredibly sensitive to
> cache alignment, or something similar.

This reminds me of my remarks on JITed mops.pasm, which varied by ~50% (or more) depending on the position of the loop in memory; see near the end of jit/i386/jit_emit.h. And no, I still don't know what's going on.

(The story for perl5-porters, plus my comment: the loop is just one subtraction and a conditional jump. Inserting nops before this loop has a drastic impact on performance. Below is the gdb output of the loop.)

/* my i386/athlon has a drastic speed penalty for what?
 *   not for unaligned odd jump targets
 *
 * But:
 *   mops.pbc 790 => 300-530  if code gets just 4 bytes bigger
 *   (loop is at 200 instead of 196 ???)
 *
 * FAST:
 * 0x818100a <jit_func+194>: sub %edi,%ebx
 * 0x818100c <jit_func+196>: jne 0x818100a <jit_func+194>
 *
 * Same fast speed w/o 2nd register
 * 0x8181102 <jit_func+186>: sub 0x8164c2c,%ebx
 * 0x8181108 <jit_func+192>: jne 0x8181102 <jit_func+186>
 *
 * SLOW (same slow with register or odd aligned)
 * 0x818118a <jit_func+194>: sub 0x8164cac,%ebx
 * 0x8181190 <jit_func+200>: jne 0x818118a <jit_func+194>
 *
 */

> Nicholas Clark

leo
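
One way to look at the addresses in the gdb output above is to check where each instruction sits relative to aligned block boundaries. The sketch below is only an illustration: the addresses are copied from the post, but the helper names (`offset_in_block`, `same_block`, `describe`) are hypothetical, and the block sizes 16, 32 and 64 are assumed candidates for fetch or cache-line granularity, not values taken from the post.

```python
# Loop addresses copied from the gdb output: (label, address of sub, address of jne).
LOOPS = [
    ("fast, register operand", 0x818100A, 0x818100C),
    ("fast, memory operand", 0x8181102, 0x8181108),
    ("slow, memory operand", 0x818118A, 0x8181190),
]


def offset_in_block(addr, block):
    """Return the byte offset of addr inside its block-sized aligned block."""
    return addr % block


def same_block(a, b, block):
    """Return True if both addresses lie in the same block-sized aligned block."""
    return a // block == b // block


def describe(block):
    """Build one report line per loop for the given (assumed) block size."""
    lines = []
    for label, sub_addr, jne_addr in LOOPS:
        lines.append(
            "%-24s sub at +%2d, jne at +%2d, same block: %s"
            % (
                label,
                offset_in_block(sub_addr, block),
                offset_in_block(jne_addr, block),
                same_block(sub_addr, jne_addr, block),
            )
        )
    return lines


if __name__ == "__main__":
    # 16, 32 and 64 are assumptions; adjust to the CPU being investigated.
    for block in (16, 32, 64):
        print("block size %d" % block)
        for line in describe(block):
            print("  " + line)
```

With a 16-byte block, the two instructions of the slow loop start in different blocks (the jne lands exactly on a 16-byte boundary), while both fast loops stay inside one block; with 32 or 64 bytes, all three loops stay inside one block. This is an observation about the addresses only; the post does not establish it as the cause of the slowdown.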