Front page | perl.perl5.porters |
Postings from December 2007
Re: optimising opcodes
Thread Previous
|
Thread Next
From:
Jim Cromie
Date:
December 29, 2007 10:41
Subject:
Re: optimising opcodes
Message ID:
47768586.3040302@gmail.com
Nicholas Clark wrote:
> I was thinking about something Jim Cromie mailed me privately, and wondered:
> "How many null ops do we have, and how many cache misses do they cause?"
> For example:
>
> $ perl -MO=Concise -e 'print foreach @ARGV'
> lots of ex-ops. Pining for the fjords. Taking up space.
>
>
4/21 ops are (in principle) removable.
> So I wondered:
>
> 0: How hard would it be to write some code to take an optree, and copy out all
> the live ops into a new optree, skipping any nulled out ops?
>
I once had a patch to peep that did this for 1 case (ex-rv2sv). Aug 16
2004. No patch attached :-(
At any rate, DaveM thought I got lucky avoiding dragons.
> 1: Would that break the runtime?
> 2: How much would it break B::Deparse?
> 3: If you configured perl to use a slab allocator, and happened to do two
> parses (first to determine the total size of ops, second to copy them in)
> and thereby allocated a slab of the exact size, would that help more?
> 4: If you copied them in execution order, would that help more?
>
>
perf ( exec-order ) >= perf (random-order-in-a-slab)
assuming data-prefetch is operating ( instruction prefetch is better,
but our ops are data)
I dont think it would be worse. Whether its measurable is something else.
cachegrind doesnt model the microarchitectural things (ref
iiwsc2006.ppt, see valgrind.org)
If one were to write a repack_ops routine, where would one call it from ?
It has to be *after* Perl_peep in order to get the exec-order optimizations.
> $ perl -MO=Concise,-exec -e 'print foreach @ARGV'
> 1 <0> enter
> 2 <;> nextstate(main 1 -e:1) v
> 3 <;> nextstate(main 1 -e:1) v
> 4 <0> pushmark sM
> 5 <$> gv(*ARGV) s
> 6 <1> rv2av[t1] sKRM/1
> 7 <$> gv(*_) s
> 8 <{> enteriter(next->c last->f redo->9) lKS
> d <0> iter s
> e <|> and(other->9) vK/1
> 9 <0> pushmark s
> a <$> gvsv(*_) s
> b <@> print vK
> c <0> unstack v
> goto d
> f <2> leaveloop vK/2
> g <@> leave[1 ref] vKP/REFC
>
>
> [note, by doing things with a slab allocator, I believe that even though the
> transformed ops are contiguous in memory, they are also indistinguishable from
> regular ops created by the perl compiler]
>
> 5: How do you spot code executed only once, and avoid doing this?
> (main program, eval blocks, top level code from do/require/use,
> BEGIN/UNITCHECK/CHECK/INIT/END - ie basically anything not a "real"
> subroutine definition)
>
Why do you think it matters ?
Is it just avoiding cpu cycles doing stuff that has low/no payoff ?
or is is caution/paranoia, or a specific reason ?
The repack_ops() should be little more than 1 more malloc,
and 1 more tree traversal, with memcopies, filling some ptr-table hashes
to map old-to-new pointers, then iterating over them to make ptr updates.
Knowing the op-size for the memcopies is more difficult than it should be.
I sucessfully reordered the ops (ie tests didnt break) here.
http://www.nntp.perl.org/group/perl.perl5.porters/2007/06/msg125532.html
My goal was to reduce the size determination down to a binary search on
op->op_type.
aside: it is unfortunate that op_type 1= type of op (forex BINOP)
I guess op_kind is a usable pseudo-synonym
> 6: Is the op_opt flag bit free at this point? Does having a bit help?
>
>
I know I always like to get a bit more ;-)
>
> and for bonus points - why are there two nextstate ops in that example, and
> can we optimise one out? How often does it happen?
>
> Nicholas Clark
>
>
Thread Previous
|
Thread Next