Re: OP_SIGNATURE

From: Zefram
Date: March 5, 2015 20:29
Subject: Re: OP_SIGNATURE
Message ID: 20150305202908.GV8710@fysh.org

Dave Mitchell wrote:
>Discrete opcodes lose in 4 big ways:

Thanks for that discussion.  Tackling it in general would seem to
ultimately require llvm or an equivalent, to compile general ops to
native code.  Short of that, I can see how these concerns lead to a
desire to stuff as much functionality as possible into a single op
and pp function.  Following this impetus would foreseeably lead us
to glom together ever larger assemblages of behaviour into single ops.
The difficulty in general, in constructing a static set of mega-op types,
is finding behaviour sufficiently stereotyped to be a productive mega-op.

Signatures happen to be a case where there's a very stereotyped set of
behaviour, involving a lot of ops in the present system, the generation
of which can be readily intercepted at a high level.  This makes it
easy to jump a few steps along that path of progressively larger ops.
The bypassing of the need to construct it from smaller ops provides
a shortcut.

If you're given free rein to invent new op types, can we expect in a
few years that much of what Perl executes would look like OP_SIGNATURE?
That seems to be what's implied by your downer on discrete ops.  While it
has some appeal from a performance point of view -- I do appreciate the
expense of op dispatch, which is part of why I use custom ops to replace
subroutine calls in some of my modules -- I find it a rather bleak picture
with respect to the space of semantic transformations that are presently
possible.  But I don't have a general solution to the expense of ops.

>Have you actually looked at entersub??? The compiled object size of
>pp_entersub is about twice that of pp_signature. It's immensely
>complicated.

Once again, you're confusing specification with implementation.  entersub
is hideously complicated (and expensive) internally, but that's because
it's dealing with a runtime that has a complicated way of handling the
simple concept of subroutine calling.  In your signature op, much of
the complexity comes directly from the conceptual space covered by the
signature syntax.  It's not just a bunch of simple parameter assignments,
it's also doing defaulting, slurpy parameters, and arity checks.  You also
complicate the actual op API with things like the special casing for zero.
It's all complexity that's visible from outside the pp function.
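
As a rough illustration of that conceptual space, here is a minimal
sketch of what a signature with one mandatory parameter, one defaulted
parameter, and a slurpy array has to do, written by hand against @_.
The sub name, parameter names, default value, and error text are all
hypothetical; this shows the behaviours involved, not how the core
implements them.

```perl
use strict;
use warnings;

# Hand-written stand-in for a signature along the lines of
#   sub greet ($name, $greeting = "Hello", @rest) { ... }
# (greet and its parameters are hypothetical, for illustration only).
sub greet {
    # Arity check: one mandatory parameter. The slurpy array means
    # there is no upper bound to enforce.
    die "greet: too few arguments (got " . scalar(@_) . "; expected at least 1)\n"
        if @_ < 1;

    my $name     = $_[0];                      # simple positional assignment
    my $greeting = @_ >= 2 ? $_[1] : "Hello";  # defaulting
    my @rest     = @_[2 .. $#_];               # slurpy: everything left over

    return join " ", "$greeting, $name", @rest;
}
```

Each of the three concerns (arity, defaulting, slurping) is visible to
the caller as behaviour, regardless of whether it is carried out by one
op or by many.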

>First, it is really hard to write C code to scan the optree looking for a
>particular matching subtree, without avoiding false positives and
>negatives.

I think you're finding that considerably more difficult than I do.
It certainly is a cumbersome process, though.  A bit of a write-only
effect too: the tree-matching code is much more difficult to read than
to write.

>                                               Of course the spotting code
>could be written to spot exactly that which is emitted by
>parse_subsignature(), but that would kind of defeat the purpose.

Not entirely.  As long as exactly the same optree can be produced
by ordinary code, there's a benefit there in the form of making it
easy to convert the optree between the two forms.  Only a cheaty flag
OPpFOO_GENERATED_BY_SIGNATURE_PARSER would really defeat the purpose.

>Secondly, it wastes a *lot* of memory.

Interesting point.  But your opportunities to avoid generating the
explicit ops corresponding to new mega-ops are going to be limited.
If you do a few more things like multideref, you're going to find that
you want us to get better at freeing ops.  I'd rather we face that issue
head on than duck it.

We can probably shuffle ops at the end of sub compilation, to fully
defragment the slab.  I think we have a handle on all the inter-op links
that we'd need to update.  Defrag into a fresh slab and hey presto,
optimal slab size.

>                          Can you name one major language extension to
>perl that is in common production use that things like B and other
>internals hooks helped enable?

Bit of a woolly criterion there, but Data::Alias?  That specifically
relies on walking optrees during primary compilation.  (Come to think
of it, D:A is even applicable to signatures and would be broken by
the signature op.  Would require an update that includes expanding the
signature op to an explicit optree.  I did not have this case in mind
when originally objecting to the signature op, nor when raising the
issue of ease of expansion.)

>                               (Ok, there are a few minor ones, like scope
>hooks,

I think you're unfairly denigrating the utility of minor extensions.  At
$ork we use the strictures provided by indirect.pm, multidimensional.pm,
and Sub::StrictDecl, each of which prevents bugs that we would otherwise
suffer.  The debugging time we've saved is huge.  (Big codebase,
programmers of varying conscientiousness and skill.)

>       but these could have been implemented directly in core with a whole
>bunch less effort than maintaining the whole B ecosystem.)

I'm doubtful about that effort comparison being correct.  But also,
more on the principle side of things, you are pining for the bad old
monolithic days.  It would be most unfortunate to lose the class of
possibilities represented by multidimensional.pm, where having tracked
down a bug he'd introduced, on the spur of the moment Ilmari coded up
this module to rule out that type of bug ever recurring, and we were able
to apply it to the entire codebase within days.  If that could only have
been done in the core, how long would it take to get it into production?
What would be the probability that that would actually be achieved?
I think marginally-valuable things like Devel::CompiledCalls just
wouldn't happen.

The CPAN ecosystem is of much greater value than the core alone.

>                                                           Hell, when we
>removed the unneeded op_seq field several years ago, that was enough for
>Reini to declare that p5p had irretrievably broken perl.

Well, that's Reini.  We know he gets hyperbolic about these things.

>As an example, I'd like to make a simple change to rpeep()'s signature -
>so it returns an updated start op,

Sounds good, I'd like that.

>                                   but it can't be done now because rpeep
>is API and hooked.

Have you actually floated it before?  I think the gain here is worth
breaking the handful of modules involved.

>NB the only three signature enhancements that I can think of are:
>
>    * types and attributes for lex vars, e.g. my int $foo :bar(...)
>    * aliases, i.e. the equivalent of perl6's "is rw".
>
>Types and aliases could be trivially implemented in core as long as a
>syntax was agreed upon; I don't know enough about attributes to form an
>opinion on them.

Types would require having a type system in core, which we still don't
have any hint of.  That's a lot more than a matter of syntax.

Off the top of my head, other foreseeable signature features:

    * parameter-supplied truth values in lexical variables (as I actually
      implemented, in core form, but wasn't accepted for 5.20)
    * making parameter variables read-only once initialised
    * parameter constraint predicates (sub z ($x where $x > 3) {...})
    * grouped optionality (accept 1 or 3 arguments but not 2)

Should these all go into core?  Only the ones that are "useful enough"?
We've had enough of those debates in the past.  The availability
of particular signature features shouldn't be limited by the core
development process.  Nor should we rule out signature features that
only make sense in the context of some non-core framework module on CPAN.
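
To make two of those list items concrete, here is a minimal sketch of
grouped optionality and a constraint predicate, written by hand against
@_ in present-day Perl.  The names $lo and $hi, the return value, and
the error messages are assumptions made for illustration; the
"where" syntax above is not implemented anywhere in this sketch.

```perl
use strict;
use warnings;

# Hypothetical sub z: accepts 1 or 3 arguments but not 2, and requires
# its first argument to satisfy the predicate $x > 3.
sub z {
    my $n = @_;

    # Grouped optionality: the second and third arguments are optional
    # only as a pair.
    die "z: expected 1 or 3 arguments, got $n\n"
        unless $n == 1 || $n == 3;

    my ($x, $lo, $hi) = @_;

    # Parameter constraint predicate.
    die "z: constraint (\$x > 3) failed\n" unless $x > 3;

    return defined $lo ? sprintf("%s in [%s, %s]", $x, $lo, $hi) : "$x";
}
```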

>Yes, but such a hypothetical op can't check for valid arg counts,
>can't handle placeholders (my ($a, undef, $b) = @foo), can't handle
>non-contiguous pad ranges, and can't handle default args.

You're once again complaining that my example is only an example of part
of the system, rather than a complete set of op designs.

>                                                          As soon as you
>throw any of those into the mix, it becomes slow again.

You seem to have an underlying position that everything that a signature
can generate must be optimised to death, all in one go.  (It looks
a bit as though it implies that the performance of non-signature code
doesn't matter at all, but I know you don't actually hold that opinion.)
It is not necessary that all signature stuff be equally fast, nor that
all the optimisation come at once.

>That code isn't equivalent at all.

In context, it serves the purpose of supplying a default value for
a parameter.  That's the equivalence.  If "@_ >= N+1 ? $_[N] :" were
subject to specific optimisation, I would have written such optimisable
code there instead of mutating @_.
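
A minimal sketch of the two forms being compared, with N = 1.  The code
under discussion is not quoted in this message, so the @_-mutating form
below is an assumed reconstruction of the general idea, and the sub
names and the default value 10 are hypothetical.

```perl
use strict;
use warnings;

# Form 1: the conditional pattern "@_ >= N+1 ? $_[N] : DEFAULT".
sub scale_cond {
    my $x      = $_[0];
    my $factor = @_ >= 2 ? $_[1] : 10;
    return $x * $factor;
}

# Form 2: mutate @_ so that an ordinary list assignment picks up the
# default.  The push extends @_ itself and does not touch the caller's
# variables.
sub scale_mutate {
    push @_, 10 if @_ < 2;
    my ($x, $factor) = @_;
    return $x * $factor;
}
```

Both forms give the parameter the same value for any argument list.
They differ in side effects: the second leaves @_ longer than the
caller made it, which later code in the sub could observe.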

-zefram
