Front page | perl.perl5.porters |
Postings from February 2015
OP_SIGNATURE
From: Dave Mitchell
Date: February 22, 2015 20:50
Subject: OP_SIGNATURE
Message ID: 20150222204954.GA28599@iabyn.com
The branch smoke-me/davem/op_signature, currently being smoked, adds
a new op, OP_SIGNATURE, to the core. The principal purpose of this
op is to handle (most of) the work of assigning args to params on
signature subs (e.g. sub foo ($a, $b, $c = 1) {...}).
Currently this process is done by lots of individual ops, and is very
slow. The OP_SIGNATURE op uses an op_aux array similarly to OP_MULTIDEREF,
containing lists of actions plus simple default arguments (currently
0, 1, an IV, a const SV, a pad var, or a package var). More complex
defaults are left as sequences of ops after the OP_SIGNATURE.
This changes signatured subs from being very slow to being even faster
than traditional subs.
Since this is an internal implementation detail change to an experimental
feature, I'm hoping that it will be non-controversial to merge it into
blead once it passes smoking.
This branch also contains a second significant commit, which optimises
simple uses of my(...)=@_ at the start of a sub into an OP_SIGNATURE.
This is currently *not* enabled by default; it requires you to build
with -DPERL_FAKE_SIGNATURE. Since it's not built by default, I'm assuming
that this will also be non-controversial for blead. I envisage enabling
it by default for 5.23.0.
This branch also contains some rationalisation of when peep(),
finalize_optree() etc are called; they tend to all get called at the same
time, with some other miscellaneous tidying up, so I've bundled them
all into a single wrapper function called S_postprocess_optree().
I've then added another function called prefinalize_optree(), which
is like finalize_optree(), but which gets called *before* peep():
so you get a chance to mess with the optree before it's been optimised.
In terms of benchmarks, these subs:
    sub plain { my ($a, $b, $c) = @_;1 }
    sub sig ($a, $b, $c) {}
when invoked as:
    my $self = {}; plain($self, 1,2);
    my $self = {}; sig($self, 1,2);
show the following numbers of instruction reads under bench.pl:
    plain    sig
     1492   2349   blead
     1492   1108   my branch
     1119   1108   my branch with -DPERL_FAKE_SIGNATURE
This shows that slow signature functions become even faster than plain
functions, and that plain functions can join in the fun by using
PERL_FAKE_SIGNATURE.
Other metrics like Dr, Dw show similar improvements, and other
permutations of signatures and default args also show similar benefits.
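For a rough wall-clock comparison rather than instruction counts, here is a minimal sketch using the core Benchmark module. It is not the bench.pl harness that produced the numbers above, the iteration count is an arbitrary assumption, and the results will depend on the perl version and build options.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';
use Benchmark qw(cmpthese);

# The two subs from the benchmark above.
sub plain { my ($a, $b, $c) = @_; 1 }
sub sig ($a, $b, $c) { }

my $self = {};

# Iteration count is arbitrary (an assumption, not taken from the post).
cmpthese(1_000_000, {
    plain => sub { plain($self, 1, 2) },
    sig   => sub { sig($self, 1, 2) },
});
```

Wall-clock timings are much noisier than the instruction-read counts quoted above, so treat any difference it reports as indicative only.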
Finally here are the commit messages of the two main commits:
commit bf73441e0bcd843be03a50825841bf4967fd8cba
Author: David Mitchell <davem@iabyn.com>
AuthorDate: Tue Jan 20 17:45:55 2015 +0000
Commit: David Mitchell <davem@iabyn.com>
CommitDate: Sun Feb 22 18:44:01 2015 +0000
add OP_SIGNATURE
In the current (experimental) subroutine signatures implementation, the
checking for sufficient numbers of args, the introduction of lexical vars,
and the assignment of @_ and/or default values to them, is done by placing
lots of individual perl ops at the start of the function body.
For example, this:
sub f ($a, $b = 0, $c = "foo") {};
deparses as:
sub f {
    die sprintf("Too many arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ <= 3;
    die sprintf("Too few arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ >= 1;
    my $a = $_[0];
    my $b = @_ >= 2 ? $_[1] : 0;
    my $c = @_ >= 3 ? $_[2] : 'foo';
    ();
}
which is of course *very* slow.
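To see the behaviour those ops implement, here is a small sketch that exercises the same signature from the caller's side. The helper call_error() is a hypothetical name introduced only for this illustration, and the exact wording of the error messages beyond their leading text differs between perl versions.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

sub f ($a, $b = 0, $c = "foo") { return ($a, $b, $c) }

# Hypothetical helper: returns the error message from calling f() with the
# given args, or the empty string if the call succeeded.
sub call_error {
    my @args = @_;
    return eval { f(@args); 1 } ? '' : $@;
}

print join(',', f(1)), "\n";                  # 1,0,foo (defaults fill in $b and $c)
print "too few rejected\n"  if call_error()           =~ /^Too few arguments/;
print "too many rejected\n" if call_error(1, 2, 3, 4) =~ /^Too many arguments/;
```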
This commit adds a new op, OP_SIGNATURE, of type UNOP_AUX, which handles
most of this work in a single op. It's inspired by my OP_MULTIDEREF work,
where 'simple' indices and keys (integers, consts, simple lex and package
vars) were handled directly by the op using values and pointers stored in
the op_aux array. For OP_SIGNATURE, simple default args (ints, consts, and
sometimes lexicals and package vars) are stored in the op_aux struct,
while more complex expressions are compiled as assign statements following
the OP_SIGNATURE op. For example:
sub f ($a, $b = 0, $c = "foo", $d = $c+1) {};
now gives
$ ./perl -Ilib -MO=Concise,f,-exec /tmp/signatures.t
main::f:
1 <;> nextstate(main 78 p:6) v:%,469762048
2 <+> signature($a, $b=0, $c="foo", $d=<expr>) v
3 <;> nextstate(main 81 p:6) v:%,469762048
4 <0> padsv[$c:80,82] s
5 <$> const[IV 1] s
6 <2> add[$d:81,82] sK/TARGMY,2
7 <;> nextstate(main 82 p:6) :%,469762048
8 <1> leavesub[1 ref] K/REFC,1
The performance is impressive, with a simple
sub f($a, $b, $c) {}
now being typically faster than
sub { my ($a, $b, $c) = @_; }
and signatured subs being typically twice as fast to call as they were
before.
As well as benefiting from doing nearly all the work in a single op,
the fact that it is the first op in the function allows some fairly
aggressive optimisations. In particular we know that newly introduced
lexicals will always be undef and non-magical, unlike for example:
f();
my $x; # $x not undef here!
sub f { $x = 1 }
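A runnable version of that snippet, as a sketch (the names set_x and $observed are introduced here only for illustration):

```perl
use strict;
use warnings;

our $observed;

set_x();            # runs before the 'my $x' statement has been executed
my $x;              # introduces $x at run time, but does not reset its value
$observed = $x;     # still holds the value stored by set_x()

sub set_x { $x = 1 }

print defined $observed ? "x is $observed\n" : "x is undef\n";
```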
Also, a given sub is likely to be called with a particular arg always being
of the same type; for example, methods will always pass a ref as arg
1. Since lexical vars with a ref count of 1 are simply cleared in place at
end of scope, by the time of the second
call, the lexical var $self in
sub f($self, ...) {...}
is likely to be a !SvOK SV with a body of type SVt_IV (already suitable
for holding an int or RV). If OP_SIGNATURE sees that the passed arg is
SvROK, it checks whether the lexical is of type SVt_IV, and if so just
directly does
SvRV_set(...);
SvROK_on(...)
rather than calling sv_setsv(). A similar short-cut is performed for
SvIOK values.
Default values of 0 and 1 are special-cased with their own action, so
no extra data needs storing in the op_aux array. Other integer-valued
constant defaults are stored as an IV in op_aux. More general constants
are stored as an SV pointer (or pad offset for threaded builds) in op_aux.
Default values that are simple lexicals or package vars, such as
sub f ($a, $b = $a, $c = $::Foo)
are usually stored as a pad index or GV pointer in the op_aux array.
There is a complication here in that more complex defaults are stored as
ops that get executed *after* OP_SIGNATURE, which means that default arg
processing can get re-ordered. For example,
sub f ($a, $b = $a++, $c = $a) {}
might get executed as the equivalent of
# happens within OP_SIGNATURE:
$a = $_[0];
$b = $_[1];
$c = $_[2] // $a;
# delayed: happens afterwards
$b //= $a++;
since the '$a' default is handled directly by OP_SIGNATURE, while '$a++'
is postponed. Clearly in this case this would be wrong, so the '$a'
default is in fact handled outside the OP_SIGNATURE too in cases like
this, leading to correct execution roughly like:
# happens within OP_SIGNATURE:
$a = $_[0];
$b = $_[1];
$c = $_[2];
# happens afterwards
$b //= $a++;
$c //= $a
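The difference between the two orderings can be checked from plain Perl. The sketch below compares a real signatured sub against two hand-written models; naive_order() and correct_order() are hypothetical names for this illustration, and they test the argument count rather than using // so that they mirror the missing-argument rule of signatures.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# The real thing: defaults are evaluated left to right.
sub f ($a, $b = $a++, $c = $a) { return ($a, $b, $c) }

# Model of the naive re-ordering: the simple '$a' default is applied up
# front, the complex '$a++' default afterwards.
sub naive_order {
    my ($a, $b, $c) = @_;
    $c = $a   if @_ < 3;    # would happen within OP_SIGNATURE
    $b = $a++ if @_ < 2;    # delayed
    return ($a, $b, $c);
}

# Model of the corrected ordering: both defaults are delayed.
sub correct_order {
    my ($a, $b, $c) = @_;
    $b = $a++ if @_ < 2;
    $c = $a   if @_ < 3;
    return ($a, $b, $c);
}

print join(',', f(10)), "\n";              # 11,10,11
print join(',', correct_order(10)), "\n";  # 11,10,11
print join(',', naive_order(10)), "\n";    # 11,10,10
```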
This commit also makes signatured subs deparse correctly for the first
time, and also makes t/op/signature.t pass under TEST -deparse.
Note that this commit introduces a hard limit of 32767 parameters for any
signature sub, but I can't conceive of that being an issue.
commit e8c15d5fb2f6abb65aa9692cc467a567a5e6c009
Author: David Mitchell <davem@iabyn.com>
AuthorDate: Mon Feb 16 17:32:45 2015 +0000
Commit: David Mitchell <davem@iabyn.com>
CommitDate: Sun Feb 22 18:44:01 2015 +0000
make my(...)=@_ use OP_SIGNATURE
This isn't yet enabled by default: it requires perl to be built with
PERL_FAKE_SIGNATURE defined.
Where the first statement in a function is a simple
my (....) = @_;
with the my elements being any mixture of scalars or undefs, with an
optional final array or hash, then convert that subtree of ops into
a single OP_SIGNATURE op, which will be faster.
The op will have the OPpSIGNATURE_FAKE private flag set, to distinguish
it from real signatured subs.
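As an illustration of the forms described above, here are a few subs whose first statement fits that pattern, plus one that does not. This is a sketch based only on the description in this commit message; the sub names are made up, and whether a given sub is actually converted depends on the build and on the exact matching rules in the code.

```perl
use strict;
use warnings;

# Scalars only.
sub three_scalars { my ($x, $y, $z) = @_; return "$x$y$z" }

# An undef placeholder skips an argument.
sub skip_second { my ($x, undef, $z) = @_; return "$x$z" }

# An optional final array soaks up the remaining args.
sub with_rest { my ($self, @rest) = @_; return scalar @rest }

# An optional final hash takes name/value pairs.
sub with_opts { my ($self, %opt) = @_; return join ',', sort keys %opt }

# Not of the my(...)=@_ form (shift-based unpacking), so presumably left alone.
sub shifted { my $self = shift; return $self }
```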
--
The warp engines start playing up a bit, but seem to sort themselves out
after a while without any intervention from boy genius Wesley Crusher.
-- Things That Never Happen in "Star Trek" #17