Optimised lexical assignments / faster macroöps

Paul "LeoNerd" Evans
October 6, 2021 11:18
A question in two parts:
  1. A specific question,
  2. ... which leads on to a broader generalisation


I've been looking lately into the details of how the optree is formed
around regular assignments into lexical variables. Assignments into
scalar lexicals aren't too bad; they look like:

  my $lex = (some RHS scalar value);

  OP_SASSIGN=BINOP
    (whatever ops for the RHS value)
    OP_PADSV [targ = $lex + OPf_REF|OPf_MOD flags]

assignments into arrays or hashes look a little bit more complex:

  my @lex = (some RHS list value);

  OP_AASSIGN=BINOP
    OP_LIST
      OP_PUSHMARK
      (whatever ops for the RHS value)
    OP_LIST
      OP_PUSHMARK
      OP_PADAV [targ = @lex + OPf_REF|OPf_MOD flags]

and similarly for hashes, except using OP_PADHV.

I can't help thinking that this puts quite a bit of churn on the value
stack, and also the markstack in the list assignment case. I wonder
whether in these relatively common cases it might make sense to
peephole-optimise them into a set of three specialised ops that use
op_targ to store the pad index of the variable being assigned into,
thus turning these cases into:

  OP_PADSV_STORE=UNOP [targ = $lex]
    (whatever ops for the RHS value)

  OP_PADAV_STORE=LISTOP [targ = @lex]
    (whatever ops for RHS value)

  (plus OP_PADHV_STORE which would look similar)
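
As a rough illustration of the stack-churn argument, here is a toy model
in C. It is not perl's real pp code: every toy_* name is made up for this
sketch, and it assumes that the assignment leaves the assigned value on
the stack as its result. It runs "CONST; PADSV; SASSIGN" and the proposed
"CONST; PADSV_STORE" over a small value stack and counts pushes, pops and
op dispatches for each.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: the names echo the ops discussed above, but none of
 * this is perl's real interpreter code. */
enum toy_op { TOY_CONST, TOY_PADSV, TOY_SASSIGN, TOY_PADSV_STORE, TOY_END };

struct toy_insn {
    enum toy_op op;
    int arg;               /* constant value, or pad index (cf. op_targ) */
};

#define TOY_PAD_SIZE   8
#define TOY_STACK_SIZE 16

struct toy_vm {
    int pad[TOY_PAD_SIZE];      /* stands in for the lexical pad */
    int stack[TOY_STACK_SIZE];  /* holds values or pad indices */
    size_t sp;
    unsigned pushes, pops, dispatches;
};

/* What the caller gets back: the stored value plus the churn counters. */
struct toy_stats {
    int stored;
    unsigned pushes, pops, dispatches;
};

static void toy_push(struct toy_vm *vm, int v)
{
    assert(vm->sp < TOY_STACK_SIZE);
    vm->stack[vm->sp++] = v;
    vm->pushes++;
}

static int toy_pop(struct toy_vm *vm)
{
    assert(vm->sp > 0);
    vm->pops++;
    return vm->stack[--vm->sp];
}

static void toy_run(struct toy_vm *vm, const struct toy_insn *prog)
{
    for (; prog->op != TOY_END; prog++) {
        vm->dispatches++;
        switch (prog->op) {
        case TOY_CONST:         /* push the RHS value */
        case TOY_PADSV:         /* push the target's pad index */
            toy_push(vm, prog->arg);
            break;
        case TOY_SASSIGN: {     /* pop target and value, store, push result */
            int target = toy_pop(vm);
            int value  = toy_pop(vm);
            assert(target >= 0 && target < TOY_PAD_SIZE);
            vm->pad[target] = value;
            toy_push(vm, value);
            break;
        }
        case TOY_PADSV_STORE:   /* target comes from arg; value stays put */
            assert(vm->sp > 0);
            assert(prog->arg >= 0 && prog->arg < TOY_PAD_SIZE);
            vm->pad[prog->arg] = vm->stack[vm->sp - 1];
            break;
        case TOY_END:
            break;
        }
    }
}

static struct toy_stats toy_exec(const struct toy_insn *prog, int padix)
{
    struct toy_vm vm = {0};
    struct toy_stats st;

    toy_run(&vm, prog);
    st.stored = vm.pad[padix];
    st.pushes = vm.pushes;
    st.pops = vm.pops;
    st.dispatches = vm.dispatches;
    return st;
}

/* Today's shape: RHS, then PADSV, then SASSIGN. */
struct toy_stats toy_assign_unfused(int value, int padix)
{
    const struct toy_insn prog[] = {
        {TOY_CONST, value}, {TOY_PADSV, padix}, {TOY_SASSIGN, 0}, {TOY_END, 0}
    };
    return toy_exec(prog, padix);
}

/* Proposed shape: RHS, then a single PADSV_STORE. */
struct toy_stats toy_assign_fused(int value, int padix)
{
    const struct toy_insn prog[] = {
        {TOY_CONST, value}, {TOY_PADSV_STORE, padix}, {TOY_END, 0}
    };
    return toy_exec(prog, padix);
}
```

In this model the unfused form costs 3 pushes, 2 pops and 3 dispatches,
against 1 push, 0 pops and 2 dispatches for the fused form. Those numbers
only describe the toy; whether real perl sees a comparable saving is
exactly what would need measuring.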

To this end I might have a go at making a little CPAN module for doing
this.

It would also be useful to measure whether it actually yields any
performance benefit. If so, it might become a useful core addition.


Except, now this leads me onto the larger question. There's nothing
*particularly* specific to lexical assignments about this. I'm sure
similar optimisations could be argued about for many other situations
in core perl.

Right now, we have a few bits of core already that do things like this;
e.g. OP_MULTICONCAT or OP_AELEMFAST_LEX, which are just high-speed
optimisations of common optree shapes. They take the observation that
running a few, larger ops ends up being faster overall than lots of
small ones.
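
The general shape of such a fusion pass can be sketched in C over a toy
op list. This is an assumption-laden model, not perl's peephole optimiser:
a flat array in execution order stands in for the op_next chain, all
toy_* names are hypothetical, and a real pass would also have to check op
flags and context before rewriting anything.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a flat array in execution order stands in for the
 * op_next chain; these are not perl's real op structures. */
enum toy_op { TOY_CONST, TOY_ADD, TOY_PADSV, TOY_SASSIGN, TOY_PADSV_STORE };

struct toy_insn {
    enum toy_op op;
    int arg;               /* constant value, or pad index (cf. op_targ) */
};

/* Rewrite every adjacent "PADSV; SASSIGN" pair into one PADSV_STORE that
 * carries the pad index itself. Compacts in place; returns the new length. */
size_t toy_peephole(struct toy_insn *prog, size_t len)
{
    size_t in, out = 0;

    assert(prog != NULL || len == 0);
    for (in = 0; in < len; in++) {
        if (prog[in].op == TOY_PADSV
            && in + 1 < len
            && prog[in + 1].op == TOY_SASSIGN) {
            prog[out].op = TOY_PADSV_STORE;
            prog[out].arg = prog[in].arg;
            out++;
            in++;          /* also consume the SASSIGN */
        } else {
            prog[out++] = prog[in];
        }
    }
    return out;
}
```

Given the five-op sequence CONST, CONST, ADD, PADSV, SASSIGN, this returns
a length of 4, with the last op now a PADSV_STORE holding the pad index
that the PADSV used to carry.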

It's a nice pattern - I've already written a module for doing similar
things to MULTICONCAT with maths operations:

This is also loosely inspired by Zefram's

I wonder if there is scope for creating a general way to do this sort
of thing, along with a way to measure the performance boost it gives in
any reasonable workflow, to judge whether it's worth doing.

Paul "LeoNerd" Evans

