
replace S_new_HE with Perl_new_body ?

Jim Cromie
March 25, 2006 16:27

attached patch is an RFC patch - to find out whether it's worth
tweaking HE allocations any further.

current code has S_new_he() and S_more_he(), which duplicate the
functions S_new_body() and S_more_bodies() in sv.c, but avoid runtime
sv-type indexing.

patch exposes S_new_body (i.e. converts it to Perl_new_body),
which makes it available to simplify HE allocation in hv.c.
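
To make the duplication concrete, here is a minimal sketch of the two
allocation styles, assuming a simple free-list arena. Every name in it
(body_roots, body_size, carve_arena, new_body, new_he_private, struct he)
is hypothetical and simplified; it is not the code in sv.c or hv.c. The
generic allocator takes the type as a runtime argument and indexes its
tables with it, while the private one has its root and size hard-coded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash entry; not perl's real HE layout. */
struct he { struct he *next; void *key; void *val; };

/* Hypothetical body types; sv.c indexes by sv-type instead. */
enum body_type { BODY_OTHER, BODY_HE, BODY_TYPE_COUNT };

#define BODIES_PER_ARENA 64

/* Per-type tables used by the generic allocator. */
static const size_t body_size[BODY_TYPE_COUNT] = {
    2 * sizeof(void *), sizeof(struct he)
};
static void *body_roots[BODY_TYPE_COUNT];

/* Carve one malloc'd arena into a singly linked free list and return
 * its head. The arena is never freed in this sketch. */
static void *carve_arena(size_t size)
{
    char *arena;
    size_t i;

    assert(size >= sizeof(void *));   /* each body must hold a link */
    arena = malloc(size * BODIES_PER_ARENA);
    if (!arena)
        abort();
    for (i = 0; i + 1 < BODIES_PER_ARENA; i++)
        *(void **)(arena + i * size) = arena + (i + 1) * size;
    *(void **)(arena + (BODIES_PER_ARENA - 1) * size) = NULL;
    return arena;
}

/* Generic style: the type arrives at runtime and indexes both tables. */
void *new_body(enum body_type t)
{
    void *body = body_roots[t] ? body_roots[t] : carve_arena(body_size[t]);
    body_roots[t] = *(void **)body;   /* pop the free-list head */
    return body;
}

/* Private style: same logic, but root and size are hard-coded. */
static void *he_root;

struct he *new_he_private(void)
{
    void *body = he_root ? he_root : carve_arena(sizeof(struct he));
    he_root = *(void **)body;
    return body;
}
```

The two allocators differ only in where the root and the size come from,
which is the trade the rest of this mail is about.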

patch has 4 #ifdef'd code-chunks, for 4 values of HE_ALLOC,
that replace current HE allocation mechanics as follows:

/* Use Arenas for HE allocation, 1 of 4 ways:
   1 legacy:        private new_he, more_he
   2 borrowed:      use Perl_new_body() from sv.c
   3 inline:        new_body_inline from sv.c
   4 double-inline: extra inlining.
*/
Obviously, we only need 1 way to do it.
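
The four-way switch can be pictured as a preprocessor selection around a
single new_HE macro. What follows is only a rough sketch of that shape,
not the patch itself: HE_ALLOC is the patch's name, but the default value,
struct he, and the helper bodies (generic_new_body, private_new_he,
inline_new_he, make_entry, all using calloc in place of real arenas) are
assumptions made so the fragment compiles on its own. Mode 4 is treated
like mode 3 here, since its difference lies at the call sites.

```c
#include <assert.h>
#include <stdlib.h>

#ifndef HE_ALLOC
#define HE_ALLOC 1            /* assumed default for this sketch: legacy */
#endif

/* Hypothetical stand-in for a hash entry. */
struct he { struct he *next; void *key; void *val; };

/* Stand-in for a shared, size-driven allocator (calloc, not an arena). */
void *generic_new_body(size_t size)
{
    void *p = calloc(1, size);
    assert(p != NULL);
    return p;
}

/* Stand-in for the inline macro: a statement that assigns into dst. */
#define new_body_inline(dst, size) \
    do { (dst) = generic_new_body(size); } while (0)

#if HE_ALLOC == 1             /* legacy: private allocator */
struct he *private_new_he(void)
{
    struct he *he = calloc(1, sizeof *he);
    assert(he != NULL);
    return he;
}
#  define new_HE() private_new_he()
#elif HE_ALLOC == 2           /* borrowed: call the shared function */
#  define new_HE() ((struct he *)generic_new_body(sizeof(struct he)))
#elif HE_ALLOC == 3 || HE_ALLOC == 4   /* inline: macro inside a function */
struct he *inline_new_he(void)
{
    struct he *he;
    new_body_inline(he, sizeof(struct he));
    return he;
}
#  define new_HE() inline_new_he()
#else
#  error "HE_ALLOC must be 1, 2, 3 or 4"
#endif

/* Call sites look the same whichever mode was compiled in. */
struct he *make_entry(void *key, void *val)
{
    struct he *he = new_HE();
    he->next = NULL;
    he->key = key;
    he->val = val;
    return he;
}
```

Building with -DHE_ALLOC=2 or -DHE_ALLOC=3 swaps the allocator without
touching make_entry().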

borrowed mode maps the new_HE macro to a call to Perl_new_body().
This incurs extra stack usage (for sv_type), and indexing with it,
vs legacy, where it's hard-coded.

inline mode copies the new_body_inline macro into hv.c, then uses it in
S_new_HE().  The latter is still a real function, since it's called from
call-sites which depend on a 'return' value.  This should perform the
same as legacy mode.

double-inline mode changes the assignment call-sites to new_HE_in() macros,
which expand by default (single-inline) to the same assignment.  When
double-inline is enabled, new_HE_in expands to a direct call to
new_body_inline(), avoiding the function call overhead.
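
The difference between the two inline modes comes down to a statement
macro versus a function that returns a value. Here is a small sketch of
that idea, under stated assumptions: new_HE_in and new_body_inline are
names from the patch, but their bodies below, the DOUBLE_INLINE switch,
the static pool, and make_entry are all hypothetical stand-ins, not the
patch's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash entry, plus a tiny static pool. */
struct he { struct he *next; void *key; void *val; };

#define POOL_SIZE 8
static struct he pool[POOL_SIZE];
static struct he *he_root;
static int pool_ready;

/* Link the static pool into a free list on first use. */
static void init_pool(void)
{
    size_t i;
    for (i = 0; i + 1 < POOL_SIZE; i++)
        pool[i].next = &pool[i + 1];
    pool[POOL_SIZE - 1].next = NULL;
    he_root = &pool[0];
    pool_ready = 1;
}

/* Statement macro: pops the free-list head into dst. It yields no
 * value, so it cannot appear on the right-hand side of an assignment. */
#define new_body_inline(dst)                 \
    do {                                     \
        if (!pool_ready)                     \
            init_pool();                     \
        assert(he_root != NULL);             \
        (dst) = he_root;                     \
        he_root = he_root->next;             \
    } while (0)

/* Single-inline: wrap the macro in a function so callers get a value. */
struct he *new_HE(void)
{
    struct he *he;
    new_body_inline(he);
    return he;
}

/* Call sites name the target instead of using a return value, so the
 * macro can expand either way. DOUBLE_INLINE is a hypothetical switch. */
#ifdef DOUBLE_INLINE
#  define new_HE_in(he) new_body_inline(he)
#else
#  define new_HE_in(he) ((he) = new_HE())
#endif

struct he *make_entry(void *key, void *val)
{
    struct he *he;
    new_HE_in(he);
    he->next = NULL;
    he->key = key;
    he->val = val;
    return he;
}
```

With DOUBLE_INLINE defined, the pop is expanded at every call site, which
is where any size growth would come from.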

The sizes that result from all these combos are as one would expect:
borrowing is smallest; inline loses some of the shrink, but is still smaller
than the original; double-inline enlarges things by a couple hundred bytes.

I haven't benchmarked the variants, except with make test.  Results follow.

   text    data     bss     dec     hex filename
2122726    6772    2716 2132214  2088f6 perl
  21662       0       0   21662    549e hv.o

[jimc@harpo new-body]$ size perl hv.o
   text    data     bss     dec     hex filename
2122877    6772    2716 2132365  20898d perl
  21662       0       0   21662    549e hv.o

   text    data     bss     dec     hex filename
2122781    6772    2716 2132269  20892d perl
  21566       0       0   21566    543e hv.o

real    9m43.410s
user    5m22.896s
sys     0m21.821s

   text    data     bss     dec     hex filename
2122813    6772    2716 2132301  20894d perl
  21599       0       0   21599    545f hv.o

real    9m43.624s
user    5m26.228s
sys     0m21.857s

   text    data     bss     dec     hex filename
2122997    6772    2716 2132485  208a05 perl
  21785       0       0   21785    5519 hv.o

real    9m56.889s
user    5m35.601s
sys     0m22.673s
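
For reading the listings above, a small sketch that computes each text
size relative to the first listing. The numbers are copied from the size
output; the labels are just listing positions, since the output does not
say which listing belongs to which mode, and the struct and function
names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Text sizes copied from the five `size perl hv.o` listings, in order. */
struct build { const char *label; long perl_text; long hvo_text; };

static const struct build builds[] = {
    { "listing 1", 2122726L, 21662L },
    { "listing 2", 2122877L, 21662L },
    { "listing 3", 2122781L, 21566L },
    { "listing 4", 2122813L, 21599L },
    { "listing 5", 2122997L, 21785L },
};

#define N_BUILDS ((int)(sizeof builds / sizeof builds[0]))

/* Change in hv.o text size relative to the first listing. */
long hvo_delta(int i)
{
    assert(i >= 0 && i < N_BUILDS);
    return builds[i].hvo_text - builds[0].hvo_text;
}

/* Change in perl text size relative to the first listing. */
long perl_delta(int i)
{
    assert(i >= 0 && i < N_BUILDS);
    return builds[i].perl_text - builds[0].perl_text;
}

/* Print one line per listing with both deltas. */
void print_deltas(void)
{
    int i;
    for (i = 0; i < N_BUILDS; i++)
        printf("%s: hv.o %+ld, perl %+ld\n",
               builds[i].label, hvo_delta(i), perl_delta(i));
}
```

If the listings are in the order baseline, legacy, borrowed, inline,
double-inline (an assumption, though it matches the size description
earlier in this mail), the hv.o deltas are 0, 0, -96, -63 and +123 bytes.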

Apparently, these timings don't show real performance differences.
I suspect folks might have a preference based on code clarity (assuming
the ifdefs are stripped).
Maybe a better benchmark would show something more definitive.
If INLINE or DOUBLE-INLINE don't demonstrate any advantages,
then BORROWED is probably the best way.  (we can drop export of
get_arena too)

comments ?