perl.perl5.porters | Postings from March 2015


Dave Mitchell
March 5, 2015 17:22
On Wed, Mar 04, 2015 at 08:16:06PM +0000, Zefram wrote:
> Your multideref and signature ops both have embedded opcode-based
> sublanguages, and this gets me a bit concerned.  It seems to be an
> inner-platform situation.  It's not totally outrageous to have such a
> thing -- after all, regexps already work by such a sublanguage.  But I
> think the impetus to create an inner platform should always be suspect.

Yes, I'm not keen on them, but in a few select cases they can make a
major performance difference not achievable by any other method.

> What's bad about the opcode system we've already got that means it has to
> be supplanted?  (In this case, presumably the performance implications
> of the runloop.)

Discrete opcodes lose in 4 big ways:

* the overhead of the runops loop and calling the pp function.

* the execution pipeline stall where the CPU doesn't know what pp function
  is going to be called next (the o = o->next; (o->op_ppaddr)(aTHX) bit),
  although to be fair, switch statements in a loop can suffer from a
  similar issue.

* passing partial values (as SVs) between ops by pushing them on and off
  the stack; as well as the code overhead, that's a whole bunch of extra
  reads and writes;

* variable values and state that can't be maintained across ops. For
  example in pp_signature, there are 3 major local vars:

    UV   argc;        /* scalar(@_) */
    SV **argp;        /* current position in @_'s AvARRAY */
    SV **padp;        /* pad slot for current var */

  and the core of the main loop in that function is:

    while (argc--) {
        sv_setsv(*padp++, *argp++);
    }

  (plus a lot of extra complication to do with default args, placeholder
  params etc)
  Once argp etc get initialised, they are just there, probably in a
  register, for immediate use. With separate ops, every op has to do:

        defav = GvAV(PL_defgv);
        argc = AvFILLp(defav) + 1;
        argp = AvARRAY(defav);
        padp = &PL_curpad[o->op_targ];

  or similar.
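
To make the first and last points above concrete, here is a minimal toy
model in C. It is only a sketch: the rl_* types and functions are invented
for illustration and are not perl's real structures. One version dispatches
one op per parameter through a runops-style loop and re-derives its
pointers each time; the other sets its state up once and keeps it in
locals.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these types or names are perl's real ones. */
typedef struct rl_op rl_op;

typedef struct rl_interp {
    const int *args;   /* stands in for @_'s AvARRAY */
    size_t     argc;   /* stands in for scalar(@_) */
    int       *pad;    /* stands in for the current pad */
    rl_op     *op;     /* current op, in the spirit of PL_op */
} rl_interp;

struct rl_op {
    rl_op *(*ppaddr)(rl_interp *);  /* in the spirit of op_ppaddr */
    rl_op  *next;                   /* in the spirit of op_next */
    size_t  argix;                  /* which argument this op copies */
    size_t  targ;                   /* destination pad slot */
};

/* One op per parameter: every call re-reads args and pad from the
 * interpreter struct before it can do any useful work. */
rl_op *rl_pp_arg(rl_interp *in)
{
    rl_op *o = in->op;
    const int *argp = in->args;        /* re-derived on every op */
    int *padp = &in->pad[o->targ];     /* re-derived on every op */
    if (o->argix < in->argc)
        *padp = argp[o->argix];
    return o->next;
}

/* Runops-style loop: one indirect call per op. Returns the number of
 * dispatches so the overhead can be counted. */
size_t rl_runops(rl_interp *in, rl_op *start)
{
    size_t dispatched = 0;
    assert(in != NULL);
    in->op = start;
    while (in->op) {
        in->op = in->op->ppaddr(in);
        dispatched++;
    }
    return dispatched;
}

/* Fused version: argc, argp and padp are initialised once and then stay
 * in locals (probably registers) for the whole loop. */
void rl_signature(rl_interp *in, size_t first_targ, size_t nparams)
{
    size_t argc = in->argc < nparams ? in->argc : nparams;
    const int *argp = in->args;
    int *padp = &in->pad[first_targ];
    while (argc--)
        *padp++ = *argp++;
}
```

Binding three parameters costs three indirect calls and three rounds of
pointer setup through rl_runops(), against a single call to
rl_signature(); both leave the same values in the pad.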

> Could we fix those things in the existing framework,
> without punting to an inner platform?  (Op chaining could probably
> be made cheaper for some classes of op.)  If there must be an inner
> platform, how well can we shift things between the two platform levels?
> (With great difficulty, in these cases.  Utility functions could make
> the transformations easier, but it's painfully clear that they're not
> designed for this sort of manipulation.)

If someone can come up with a viable proposal, I'm all ears.

> >The specification of OP_SIGNATURE is 'assign the passed stack items to
> >the signature variables with default handling'.
> You're glossing over a bunch of visible details there.  It's still way
> more complicated than entersub.

Have you actually looked at entersub??? The compiled object size of
pp_entersub is about twice that of pp_signature. It's immensely

> Slight tangent: how would you feel about spinning off your optimisations
> (independent of what happens around signatures) into op types that aren't
> at all aimed at signatures?

I already have on my TODO list to look at sv_setsv() to see if it can
special-case IV and RV assignment better like pp_signature does.

> For example, you have an optimised version
> of assigning an IV to a scalar, which is particularly efficient where
> the scalar is SVt_IV and non-magical.  There are a lot of "$x = 1;"
> type statements in Perl code, for which those fast-path preconditions
> are usually true, so it's potentially a big win to use the optimised
> code for them.  Even where the preconditions are false, there's a
> side benefit in reducing the three ops (const, padsv, sassign) to one.
> I'd think this one probably worth the API complexification.

Some of this might get caught by a tweaked sv_setsv(), but having
ops that assign a small integer value (perhaps stored in op_private)
to a lex or scalar var might be a win.

> >So, ignoring specific optimisations which could be easily reverted,
> >I don't think OP_SIGNATURE is semantically visible.
> It is not those constraints alone that make the op type semantically
> visible.  They do add to its visibility, and in particular the way you
> applied them makes the op type visible in places that it shouldn't be.
> But that's the icing on the cake of its visibility.  Separating out the
> two issues that you said I was conflating, my original statement that
> "the op type is somewhat semantically visible" was actually expressing
> the view that the op type is semantically partially visible per se,
> independent of its implementation limits.  I didn't expand on that,
> because I thought it obvious enough.

It's not at all obvious to me. Perhaps that's because you see the details
of perl's OPs and optree structure as part of perl's API, while I see it
as internal implementation detail, freely changeable between releases.
More on that below.

> The interesting one is the arity
> limit.  You raise the issue of whether it's a good tradeoff, and from
> the point of view of designing optimised op types I think it probably
> *is* a good tradeoff.  Specifically, saving a couple of bytes from
> every instance of the op type is probably worth more than being able
> to apply its optimisation to very long signatures.

It's actually probably 8 bytes per sub, and stored in such a position
in the op_aux struct that it might make all the data needed for a
simple sub no longer fit in a single cache line (it's currently 32 bytes
on a 64-bit system).

> The big problem is
> that you didn't make a choice on this tradeoff spectrum: instead you
> weighed up that couple of bytes per op against being able to compile
> very long signatures at all.  The problem is that you tied the parsing of
> signatures directly to this optimised op, to the point that you lost the
> ability to compile signatures that don't qualify for this optimisation.
> At this point the signature op ceases to be an optimisation, and becomes
> the definitive implementation of signatures.
> Op types with arbitrary limits are OK as optimisations, but it is
> wrong to apply those arbitrary limits to the language as a whole.
> Like the deparser with :proto syntax, the coupling is wrong here: you
> coupled the language feature of signatures too closely to the optimised
> implementation.

I made a conscious choice that limiting the number of args to 32767
was a price worth paying (but specifically mentioned it in the commit
message in case anyone disagreed). The coupling is necessary. You would
like for Perl_parse_subsignature() to generate a vanilla optree as before,
then for code in peep() or similar to reduce that subtree to an
OP_SIGNATURE if appropriate. Which would of course remove the limit.
But this approach has two major drawbacks.

First, it is really hard to write C code to scan the optree looking for a
particular matching subtree while avoiding false positives and
negatives. And for both of them it can be hard to spot the error: for a
false negative, it means that the optimisation quietly gets turned
off and lots of code runs slowly; and for a false positive, some subtle
difference in behaviour occurs (because of a new op_private flag bit for
some op that pp_signature doesn't know about). Of course the spotting code
could be written to spot exactly that which is emitted by
parse_subsignature(), but that would kind of defeat the purpose.

For example, S_maybe_op_signature() which spots and converts the "simple"
optree generated for 'my (....)= @_;' is around 200 lines of code, while
S_maybe_multideref() which does the same for the conceptually simple
->[const]{$lexvar}[$pkgvar] style dereferencing constructs, is around
750 lines. The OP_MULTIDEREF optimisation took me around 150 hours to
write, and a large chunk of that was related to difficulties in scanning
and manipulating the op tree.
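
The false-positive risk described above can be illustrated with a toy
matcher in C. This is a sketch under stated assumptions, not code from
peep(): the pm_* types, op names and flag values are all invented. The
point is that the matcher has to reject any op carrying a flag bit it
doesn't know about, and every such rejection is a quiet false negative.

```c
#include <assert.h>
#include <stddef.h>

/* Toy op types and flags; names and values are hypothetical. */
enum pm_optype { PM_CONST, PM_PADSV, PM_SASSIGN, PM_OTHER };

typedef struct pm_op {
    enum pm_optype type;
    unsigned       private_flags;
    struct pm_op  *next;
} pm_op;

#define PM_PADSV_INTRO  0x01u             /* "my $x" introduction */
#define PM_KNOWN_PADSV  (PM_PADSV_INTRO)  /* bits the fused op handles */

/* Returns 1 only if o starts exactly const -> padsv -> sassign and no op
 * carries a flag bit outside the known set. An unknown bit means the
 * fused op might behave differently, so the match must be refused. */
int pm_match_const_assign(const pm_op *o)
{
    const pm_op *c = o, *p, *a;
    if (!c || c->type != PM_CONST || c->private_flags)
        return 0;
    p = c->next;
    if (!p || p->type != PM_PADSV
        || (p->private_flags & ~PM_KNOWN_PADSV))
        return 0;
    a = p->next;
    if (!a || a->type != PM_SASSIGN || a->private_flags)
        return 0;
    return 1;
}
```

If a later release adds a new private flag to padsv, this matcher silently
stops matching; if it had skipped the mask check, it would silently
mis-optimise instead. Neither failure is easy to notice.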

Secondly, it wastes a *lot* of memory. These days ops are allocated in
per-sub slabs, and in general, ops that are freed late in the optimisation
of a sub (e.g. during peep()), just leave wasted holes in those slabs and
can't be freed or reused. A simple sub f ($a, $b, $c) {} compiles
to 50 ops, which would then be reduced to 5 ops. Those 45 extra ops
would waste in the region of 2K+ bytes per sub. 
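
A toy bump allocator in C shows why the holes are lost. This is only a
sketch: the slab_* names are invented, and the 48-byte op size used in the
usage note is an assumed figure picked to make the arithmetic concrete,
not a measured one.

```c
#include <assert.h>
#include <stddef.h>

#define SLAB_BYTES 4096   /* toy slab size, chosen arbitrarily */

/* Toy per-sub slab: allocation just bumps an offset. */
typedef struct slab {
    unsigned char mem[SLAB_BYTES];
    size_t used;   /* bump offset: only ever grows */
    size_t dead;   /* bytes belonging to ops that were freed */
} slab;

void slab_init(slab *s)
{
    s->used = 0;
    s->dead = 0;
}

/* Hands out the next n bytes, or NULL if the slab is full. */
void *slab_alloc(slab *s, size_t n)
{
    void *p;
    assert(s != NULL);
    if (n > SLAB_BYTES - s->used)
        return NULL;
    p = s->mem + s->used;
    s->used += n;
    return p;
}

/* Freeing an op late only records the hole; the bump offset can't move
 * back, so the bytes stay allocated for the life of the sub. */
void slab_free(slab *s, void *p, size_t n)
{
    (void)p;
    s->dead += n;
}
```

Allocating 50 ops of an assumed 48 bytes each and then freeing 45 of them
leaves used at 2400 bytes with 2160 of those dead, so only 240 bytes are
doing any work.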

> >This I don't understand. There is nothing to stop a hypothetical plugin
> >either emitting a series of "plain" ops instead of an OP_SIGNATURE, or
> >later replacing the OP_SIGNATURE (depending on what point in the
> >compilation process it is called at).
> Once again you're thinking in terms of a plugin replacing the whole of the
> signature syntax, which is a trivial and uninteresting case.  My paragraph
> to which you're responding here was concerned with plugins replacing a
> small part of a signature that is otherwise parsed by the core mechanism.

No, I explicitly said elsewhere in my email that hypothetically,
parse_subsignature() could on entry check whether hooks are enabled,
and if so emit a "zefram-style" optree and call per-arg hooks or whatever
are desired, and only emit OP_SIGNATURE in the fallback case of no hooks.

[ Just as an aside, I am opposed to such hooks; my point above is not that
they should be added, just that nothing in my work precludes them from
being added later if a consensus disagreed with me. ]

> >If people aren't concerned with performance (due to that wonderful
> >ever-cheaper hardware (which has been stuck at 3Ghz for several years
> >now)), while wanting maximum flexibility, plugability etc, then again I
> >suggest that we point them to perl6. 
> So your position is that Perl 5 is, or should be, stable^Wdead?

Yes, absolutely!

For example, the B suite was added to perl around 17 years ago. What has
it achieved in that time? Can you name one major language extension to
perl that is in common production use that things like B and other
internals hooks helped enable? (Ok, there are a few minor ones, like scope
hooks, but these could have been implemented directly in core with a whole
bunch less effort than maintaining the whole B ecosystem.)

The main thing that B et al have helped achieve is to ossify the
internals; we are now constrained and pinned down at every turn,
because any change to any op, any change to any optree arrangement,
will show up in a BBC somewhere, and provoke a discussion. Hell, when we
removed the unneeded op_seq field several years ago, that was enough for
Reini to declare that p5p had irretrievably broken perl.

As an example, I'd like to make a simple change to rpeep()'s signature -
so it returns an updated start op, but it can't be done now because rpeep
is API and hooked. So instead everywhere where I do optimisations I have
to work round it, or skip optimisations starting on the first op in a

Also, every op/optree related internals change now involves fixups to B
and Deparse, so every change is that much more work.

If I could travel back in time and stop Malcolm B. writing B and friends, I
would in an instant. Perl now would have been far, far better, and
probably a lot more truly extensible than it is now.

The realisation that B (and B::C etc) were a failed experiment was one of
the drivers of perl6. It's been an albatross round perl5's neck ever

> >Think of it conceptually that Perl_parse_subsignature() produces an
> >optree, which under certain (but very common) situations can then be optimised
> >into a single OP_SIGNATURE op. It just so happens that the current
> >implementation, knowing that at the moment the optree can *always* be
> >reduced, just skips the optree generation and creates the OP_SIGNATURE
> >directly,
> Except that that's not what you've implemented.  Aside from the actual
> nature of the optimised op, the first half of this is what I've been
> proposing, and the second half wouldn't be terrible.

The second half is exactly what I've implemented: parse_subsignature()
"skips the optree generation and creates the OP_SIGNATURE directly"

The first half I'm opposed to for the reasons I enumerated earlier.

> For op manipulability and syntax pluggability, it is way more important
> to get the optimised ops generated from explicit ops than to get any
> optimised direct generation of them in the signature parser.

For avoiding 2K waste per sub, avoiding subtle peep() bugs and avoiding
requiring loads more of my time to implement the peep converter(), it is
way more important to get a working optimisation implementation now than
to allow for some future hypothetical (and unlikely ever to be actually
implemented) signature plugin system.

NB the only three signature enhancements that I can think of are:

    * types and attributes for lex vars, e.g. my int $foo :bar(...)
    * aliases, i.e. the equivalent of perl6's "is rw".

Types and aliases could be trivially implemented in core as long as a
syntax was agreed upon; I don't know enough about attributes to form an
opinion on them.

So I'm dubious about the merits of having pluggable signatures.

> By virtue of having been crafted very precisely for the specific needs
> of the signature feature, your signature op is liable to remain at least
> slightly more runtime efficient than anything that arises by other routes.
> But from the context and the "*much*" you seem to be assuming here
> not just some gain from the specificity but also that no other system
> of op types would succeed in handling multiple parameters in one op.
> That's not justified.

As I think I've explained in enough detail here and in other emails,
there is very good reason to suppose that OP_SIGNATURE is way more
efficient, and the onus is now on others to provide a counter-example.

> The padrange op stands as an example of the sort of thing that can
> be done, and indeed the existing padrange op could have a role in
> the optimisation of signature-like code.  We can surely manage to
> peephole-optimise "my $x = $_[0]; my $y = $_[1];" into a padrange+aassign.
> I wouldn't object to a variant of padrange also taking over the actual
> assignment when the RHS is @_ (or perhaps any simple array), in which
> case it can copy straight from the array to the pad variables without
> putting anything on the stack.  As with the existing padrange op, these
> optimisations would help code that's not derived from signatures.

Yes, but such a hypothetical op can't check for valid arg counts,
can't handle placeholders (my ($a, undef, $b) = @foo), can't handle
non-contiguous pad ranges, and can't handle default args. As soon as you
throw any of those into the mix, it becomes slow again.
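
For comparison, here is a toy C sketch of one loop that covers all four
cases. It is not pp_signature: the sig_* names are invented, values are
plain ints rather than SVs, and defaults are constants rather than
expressions. Each parameter names its own pad slot, so non-contiguous
slots and placeholders cost nothing extra.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder marker: consume an argument but store it nowhere. */
#define SIG_SKIP ((size_t)-1)

/* Hypothetical per-parameter description. */
typedef struct sig_param {
    size_t slot;         /* destination pad slot, or SIG_SKIP */
    int    has_default;  /* non-zero if the parameter is optional */
    int    default_val;  /* used when the caller passed too few args */
} sig_param;

/* Binds args to pad slots in a single pass. Returns 0 on success and -1
 * for too many or too few arguments. On failure the pad may already
 * have been partially written. */
int sig_bind(const sig_param *params, size_t nparams,
             const int *args, size_t argc, int *pad)
{
    size_t i;
    assert(params != NULL && pad != NULL);
    if (argc > nparams)
        return -1;                     /* too many arguments */
    for (i = 0; i < nparams; i++) {
        const sig_param *p = &params[i];
        int val;
        if (i < argc)
            val = args[i];
        else if (p->has_default)
            val = p->default_val;
        else
            return -1;                 /* missing mandatory argument */
        if (p->slot != SIG_SKIP)
            pad[p->slot] = val;        /* slots need not be contiguous */
    }
    return 0;
}
```

A signature like ($a, $, $c = 42) would be described here as three
sig_param entries, the middle one with slot set to SIG_SKIP.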

On the other hand, I'm not opposed in principle to the idea of extending
OP_SIGNATURE so that it could handle a general my (...) = @array
statement (for example a flag indicating that it should get the AV off the
stack rather than from GvAV(PL_defgv)). Potentially (but much more
hand-wavy at this point) its default args facility could be used to
handle statements like my(...) = (@array, 1, "foo") where the 1 etc are
treated as default values and expressions.

In which case arguably OP_SIGNATURE could be renamed to something more

But I think that it is better to have an op that can handle all (or most)
of the functionality of a signature (and the functionality of an initial
my(...) = @_), and which can be extended to also handle some common
my (...) = @array syntaxes too, than to have an op that can handle
some common my (...) = @array syntaxes, but can't handle signatures.

[ Just as an aside to myself while I remember, the existing signatures
implementation is subtly different from my (...) = (@_, default, expressions),
in that with
    my $a = $_[0];
    my $b = $_[1];
    my $c = @_ >= 3 ? $_[2] : <expression1>;
    my $d = @_ >= 4 ? $_[3] : <expression2>;
if expression1 is tainted, it doesn't taint $d, while the direct list
assignment would. It might have other implications too in terms of the
timing of the default expressions and whether they actually get executed. ]

> >Also, in terms of the criticism that OP_SIGNATURE isn't general purpose,
> >I think you'll find that the "push_arg" op is highly specific too; I think
> >it would be fairly rare to find perl code of the form '@_ >= N+1 ? $_[N] :
> >...' in the wild
> Interesting grep.  There's a bunch of code doing equivalent things that
> wouldn't precisely match that pattern; for example I've got some code
> that does "push @_, 0 if @_ == 1".

That code isn't equivalent at all.

> >hypothetically use the op_aux data to recreate a set of "normal" ops to
> >replace or augment the OP_SIGNATURE if required.
> An exposed utility function that recreates the normal ops would help.

I'm not opposed to that in principle. I would need to think about it some
more to see if there are any practical difficulties.

> >* Another specific optimisation restricts the number of params a sub can
> >  have to 32767. I can easily increase this limit to 2**31-1 at the
> >  expense of requiring an extra U32 of storage in the op_aux of each
> >  OP_SIGNATURE on 32-bit platforms. I am minded to do this.
> If your objective is to handle any possible arity, then 2^31-1 is
> the wrong new limit.  You should base the limit on the STRLEN type or
> something else that reflects address space size.  But as I discussed
> above, that's only needed if the signature op needs to handle every
> possible signature, which it does if there's no other implementation
> of the signature syntax.  If, as I would prefer, the signature op is
> purely an optimisation of things that are principally expressed in a
> more general form, then the 2^15-1 limit is fine.

Assuming there's no fallback to OP_SIGNATURE, I would have no qualms
whatsoever about releasing a perl with a 4G signature limit if it gave us an
advantage elsewhere.

Music lesson: a symbiotic relationship whereby a pupil's embellishments
concerning the amount of practice performed since the last lesson are
rewarded with embellishments from the teacher concerning the pupil's
progress over the corresponding period.
