
OP_SIGNATURE

From: Dave Mitchell
Date: February 22, 2015 20:50
Subject: OP_SIGNATURE
Message ID: 20150222204954.GA28599@iabyn.com

The branch smoke-me/davem/op_signature, currently being smoked, adds
a new op, OP_SIGNATURE, to the core. The principal purpose of this
op is to handle (most of) the work of assigning args to params on
signature subs (e.g. sub foo ($a, $b, $c = 1) {...}).
Currently this process is done by lots of individual ops, and is very
slow. The OP_SIGNATURE op uses an op_aux array similarly to OP_MULTIDEREF,
containing lists of actions plus simple default arguments (currently
0, 1, an IV, a const SV, a pad var, or a package var). More complex
defaults are left as sequences of ops after the OP_SIGNATURE.
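
To make the idea of "a list of actions plus simple defaults" concrete, here is a toy model written in Perl. It is only a sketch: the real op is implemented in C, the actual op_aux layout is not shown in this post, and the names ACT_ARG, ACT_DEFAULT_CONST and bind_args are hypothetical, chosen purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical action codes; not the names used in the real implementation.
use constant { ACT_ARG => 0, ACT_DEFAULT_CONST => 1 };

# Toy "aux array" modelling: sub foo ($a, $b, $c = 1) {...}
my @aux = (
    [ACT_ARG],               # $a: mandatory
    [ACT_ARG],               # $b: mandatory
    [ACT_DEFAULT_CONST, 1],  # $c: optional, constant default 1
);

# Bind arguments to parameters in a single pass over the action list,
# instead of running one statement per parameter.
sub bind_args {
    my ($aux, @args) = @_;
    my $required = grep { $_->[0] == ACT_ARG } @$aux;
    die "Too few arguments\n"  if @args < $required;
    die "Too many arguments\n" if @args > @$aux;
    my @params;
    for my $i (0 .. $#$aux) {
        my ($action, $default) = @{ $aux->[$i] };
        # Use the caller's value if one was passed, else the stored default.
        $params[$i] = $i < @args ? $args[$i] : $default;
    }
    return @params;
}

print join(",", bind_args(\@aux, 10, 20)), "\n";      # 10,20,1
print join(",", bind_args(\@aux, 10, 20, 30)), "\n";  # 10,20,30
```

The point of the model is only that argument counting, assignment and simple defaults can all be driven from one compact table in one step.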

This changes signatured subs from being very slow to being even faster
than traditional subs.

Since this is an internal implementation detail change to an experimental
feature, I'm hoping that it will be non-controversial to merge it into
blead once it passes smoking.

This branch also contains a second significant commit, which optimises
simple uses of my(...)=@_ at the start of a sub into an OP_SIGNATURE.
This is currently *not* enabled by default; it requires you to build
with -DPERL_FAKE_SIGNATURE. Since it's not built by default, I'm assuming
that this will also be non-controversial for blead. I envisage enabling
it by default for 5.23.0.

This branch also contains some rationalisation of when peep(),
finalize_optree() etc are called; they tend to all get called at the same
time, with some other miscellaneous tidying up, so I've bundled them
all into a single wrapper function called S_postprocess_optree().
I've then added another function called prefinalize_optree(), which
is like finalize_optree(), but which gets called *before* peep():
so you get a chance to mess with the optree before it's been optimised.

In terms of benchmarks: these subs:

    sub plain { my ($a, $b, $c) = @_;1 }
    sub sig ($a, $b, $c) {}

when being invoked as:

    my $self = {}; plain($self, 1,2);
    my $self = {}; sig($self, 1,2);

see the following numbers of instruction reads under bench.pl:

   plain   sig
    1492  2349  blead
    1492  1108  my branch
    1119  1108  my branch with -DPERL_FAKE_SIGNATURE

This shows that slow signature functions become even faster than plain
functions, and that plain functions can join in the fun by using
PERL_FAKE_SIGNATURE.
Other metrics such as Dr and Dw (data reads and writes) show similar
improvements, and other permutations of signatures and default args also
show similar benefits.
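
For anyone who wants a rough comparison on their own build, the following is a minimal sketch using the core Benchmark module. Note that it measures CPU time rather than the instruction reads that bench.pl reports, so the numbers are not comparable with the table above; the choice of cmpthese and the one-second run time are my assumptions, not part of the branch.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';
use Benchmark qw(cmpthese);

# The two subs from the benchmark above.
sub plain { my ($a, $b, $c) = @_; 1 }
sub sig ($a, $b, $c) { }

my $self = {};

# Run each call style for at least one CPU second and print a
# comparison table of calls per second.
cmpthese(-1, {
    plain => sub { plain($self, 1, 2) },
    sig   => sub { sig($self, 1, 2) },
});
```

Timings from this kind of test are noisy, so treat small differences with suspicion and repeat the run a few times.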


Finally here are the commit messages of the two main commits:


commit bf73441e0bcd843be03a50825841bf4967fd8cba
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Tue Jan 20 17:45:55 2015 +0000
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Sun Feb 22 18:44:01 2015 +0000

    add OP_SIGNATURE
    
    In the current (experimental) subroutine signatures implementation, the
    checking for sufficient numbers of args, the introduction of lexical vars,
    and the assignment of @_ and/or default values to them, is done by placing
    lots of individual perl ops at the start of the function body.
    
    For example, this:
    
        sub f ($a, $b = 0, $c = "foo") {};
    
    deparses as:
    
        sub f {
            die sprintf("Too many arguments for subroutine at %s line %d.\n",
                (caller)[1, 2]) unless @_ <= 3;
            die sprintf("Too few arguments for subroutine at %s line %d.\n",
                (caller)[1, 2]) unless @_ >= 1;
            my $a = $_[0];
            my $b = @_ >= 2 ? $_[1] : 0;
            my $c = @_ >= 3 ? $_[2] : 'foo';
            ();
        }
    
    which is of course *very* slow.
    
    This commit adds a new op, OP_SIGNATURE, of type UNOP_AUX, which handles
    most of this work in a single op. It's inspired by my OP_MULTIDEREF work,
    where 'simple' indices and keys (integers, consts, simple lex and package
    vars) were handled directly by the op using values and pointers stored in
    the op_aux array. For OP_SIGNATURE, simple default args (ints, consts, and
    sometimes lexicals and package vars) are stored in the op_aux struct,
    while more complex expressions are compiled as assign statements following
    the OP_SIGNATURE op. For example:
    
        sub f ($a, $b = 0, $c = "foo", $d = $c+1) {};
    
    now gives
    
        $ ./perl -Ilib -MO=Concise,f,-exec /tmp/signatures.t
        main::f:
        1  <;> nextstate(main 78 p:6) v:%,469762048
        2  <+> signature($a, $b=0, $c="foo", $d=<expr>) v
        3  <;> nextstate(main 81 p:6) v:%,469762048
        4  <0> padsv[$c:80,82] s
        5  <$> const[IV 1] s
        6  <2> add[$d:81,82] sK/TARGMY,2
        7  <;> nextstate(main 82 p:6) :%,469762048
        8  <1> leavesub[1 ref] K/REFC,1
    
    The performance is impressive, with a simple
    
        sub f($a, $b, $c) {}
    
    now being typically faster than
    
        sub { my ($a, $b, $c) = @_; }
    
    and signatured subs being typically twice as fast to call as they were
    before.
    
    As well as benefiting from doing nearly all the work in a single op,
    the fact that it is the first op in the function allows some fairly
    aggressive optimisations. In particular we know that newly introduced
    lexicals will always be undef and non-magical, unlike for example:
    
        f();
        my $x; # $x not undef here!
        sub f { $x = 1 }
    
    Also, a given sub is likely to be called with a particular arg always
    being of the same type; for example, methods will always pass a ref as
    the first arg. Since lexical vars with a ref count of 1 are simply
    cleared in place at end of scope, by the time of the second call the
    lexical var $self in
    
        sub f($self, ...) {...}
    
    is likely to be a !SvOK SV with a body of type SVt_IV (already suitable
    for holding an int or RV). If OP_SIGNATURE sees that the passed arg is
    SvROK, it checks whether the lexical is of type SVt_IV, and if so just
    directly does
    
        SvRV_set(...);
        SvROK_on(...);
    
    rather than calling sv_setsv(). A similar short-cut is performed for
    SvIOK values.
    
    Default values of 0 and 1 are special-cased with their own action, so
    no extra data needs storing in the op_aux array. Other integer-valued
    constant defaults are stored as an IV in op_aux. More general constants
    are stored as an SV pointer (or pad offset for threaded builds) in op_aux.
    
    Default values that are simple lexicals or package vars, such as
    
        sub f ($a, $b = $a, $c = $::Foo)
    
    are usually stored as a pad index or GV pointer in the op_aux array.
    There is a complication here in that more complex defaults are stored as
    ops that get executed *after* OP_SIGNATURE, which means that default arg
    processing can get re-ordered. For example,
    
        sub f ($a, $b = $a++, $c = $a) {}
    
    might get executed as the equivalent of
    
        # happens within OP_SIGNATURE:
        $a = $_[0];
        $b = $_[1];
        $c = $_[2] // $a;
        # delayed: happens afterwards
        $b //= $a++;
    
    since the '$a' default is handled directly by OP_SIGNATURE, while '$a++'
    is postponed. Clearly in this case this would be wrong, so the '$a'
    default is in fact handled outside the OP_SIGNATURE too in cases like
    this, leading to correct execution roughly like:
    
        # happens within OP_SIGNATURE:
        $a = $_[0];
        $b = $_[1];
        $c = $_[2];
        # happens afterwards
        $b //= $a++;
        $c //= $a;
    
    This commit also makes signatured subs deparse correctly for the first
    time, and also makes t/op/signature.t pass under TEST -deparse.
    
    Note that this commit introduces a hard limit of 32767 parameters for any
    signature sub, but I can't conceive of that being an issue.
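
The ordering problem described in the commit message above can be observed from plain Perl. The sketch below compares a real signature sub with a hand-written emulation of the incorrect reordering; f_reordered is a hypothetical helper written only to show the difference, and it tests the argument count rather than using //=, since signature defaults apply to missing args.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Real signature: defaults are evaluated left to right, so $c sees
# the value of $a *after* the $a++ in $b's default.
sub f ($a, $b = $a++, $c = $a) { return ($a, $b, $c) }

# Hypothetical emulation of the wrong ordering: the simple default
# for $c is applied early, the complex default for $b is delayed.
sub f_reordered {
    my $a = $_[0];
    my $b = $_[1];
    my $c = @_ >= 3 ? $_[2] : $a;  # simple default applied too early
    $b = $a++ if @_ < 2;           # complex default applied afterwards
    return ($a, $b, $c);
}

print join(",", f(5)), "\n";            # 6,5,6
print join(",", f_reordered(5)), "\n";  # 6,5,5
```

The two subs disagree only on $c, which is exactly the value that would be wrong if a simple default were allowed to run ahead of an earlier complex one.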


commit e8c15d5fb2f6abb65aa9692cc467a567a5e6c009
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Mon Feb 16 17:32:45 2015 +0000
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Sun Feb 22 18:44:01 2015 +0000

    make my(...)=@_ use OP_SIGNATURE
    
    This isn't yet enabled by default: it requires perl to be built with
    PERL_FAKE_SIGNATURE defined.
    
    Where the first statement in a function is a simple
    
        my (....) = @_;
    
    with the my elements being any mixture of scalars or undefs, with an
    optional final array or hash, then convert that subtree of ops into
    a single OP_SIGNATURE op, which will be faster.
    
    The op will have the OPpSIGNATURE_FAKE private flag set, to distinguish
    it from real signatured subs.
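
As a reading aid for the description in the commit message above, here are some subs whose first statement has the described shape, plus one that does not. This is only my interpretation of the text; whether a given sub is actually converted depends on the build (PERL_FAKE_SIGNATURE) and on details not covered in this post. The sub names are made up for the example.

```perl
use strict;
use warnings;

# First statement is my (...) = @_ with scalars only.
sub scalars_only { my ($x, $y) = @_; return $x + $y }

# An undef placeholder skips an argument.
sub with_undef { my ($self, undef, $z) = @_; return $z }

# A final hash (or array) takes the remaining arguments.
sub with_hash {
    my ($name, %opts) = @_;
    return "$name:" . join(",", sort keys %opts);
}

# Does not fit the description: the assignment from @_ is not the
# first statement in the sub.
sub not_first { my $n = 0; my ($x) = @_; return $x + $n }

print scalars_only(1, 2), "\n";                # 3
print with_undef({}, "skipped", 7), "\n";      # 7
print with_hash("job", a => 1, b => 2), "\n";  # job:a,b
print not_first(4), "\n";                      # 4
```

The behaviour of all four subs is the same with or without the optimisation; only the ops used to unpack @_ would differ.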


-- 
The warp engines start playing up a bit, but seem to sort themselves out
after a while without any intervention from boy genius Wesley Crusher.
    -- Things That Never Happen in "Star Trek" #17


