develooper Front page | perl.perl5.porters | Postings from June 2014

Re: list/pushmark optimization

Thread Previous | Thread Next
From:
Dave Mitchell
Date:
June 28, 2014 10:14
Subject:
Re: list/pushmark optimization
Message ID:
20140628101352.GA23773@iabyn.com
On Mon, Feb 24, 2014 at 04:12:05PM +0000, Nicholas Clark wrote:
> On Mon, Feb 24, 2014 at 04:54:12PM +0100, Steffen Mueller wrote:
> > On 02/24/2014 04:25 PM, Dave Mitchell wrote:
> 
> > > I think I made a vague proposal a while ago that during at at least some
> > > parts of compilation (but probably not afterwards), we could change the op
> > > tree so that the last sibling in a chain was indicated by setting an op
> > > flag rather than having a null op->sibling. That last sibling's null
> > > op_sibling field could then point back to the parent instead, so that this
> > > relation holds:
> > >
> > >      cLISTOPx(o)->op_last->op_sibling == o
> > >
> > > This would allow you to always be able to find an op's parent, and would
> > > also make it possible to do a scan of the op-tree (like scalarvoid() etc)
> > > without needing to recurse.
> > 
> > That's an interesting idea, but it's going to violate A LOT of code's 
> > assumptions about op_sibling being always NULL at the end of the list. 
> > Both inside and outside the core. Variants of for (kid = op->op_first; 
> > kid; kid = kid->op_sibling) are really common in anything OP-tree related!
> 
> I think "A LOT" is about 25, given
> http://grep.cpan.me/?q=for.*-%3Eop_sibling
> 
> That's tractable to patch.
> 
> And even if it's *anything* involving op_sibling, that's only 59
> distributions of the 29,058 on CPAN. We've got 89 shipped in the core in
> cpan/ alone.

I have just pushed the branch smoke-me/davem/sibling, which:

* adds accessor macros OP_SIBLING(o), OP_SIBLING_set(o) for the op_sibling
  field;

* adds a new general-purpose op tree manipulation function,
  op_sibling_splice(), which by analogy with perl's splice() function,
  allows you to cut zero or more ops from an op_sibling chain, and replace
  them with zero or more different ops, while correctly maintaining the
  op_first, op_last etc pointers of the parent node;

* heavily reworks op.c (and any other bits of core requiring it) to use
  the OP_SIBLING() macros and the op_sibling_splice() function where
  possible; this replaces most ad-hoc op tree manipulation in op.c;

* adds a new flag, op_lastsib, which in core at least, is set on the last
  op in each op_sibling chain, and not set elsewhere;

* adds a new build define -DPERL_OP_PARENT (off by default) that
  implements the scheme I proposed earlier, i.e. it detects the last
  sibling by using the op_lastsib flag rather than via op_sibling==NULL,
  and sets op_sibling on the last sibling node to point to its parent;
  this changes the definition of the  OP_SIBLING() macro to: test op_lastsib
  and return NULL if true, otherwise return op_silbing;

* adds a C-level Perl_op_parent() function and B::OP::parent method that
  returns an op's parent (or in the absence of PERL_OP_PARENT, just returns
  NULL)

As I see it, the next steps are:

* once it smokes ok, and unless anyone violently objects, merge into blead;

* then get PPPort to support OP_SIBLING(), op_sibling_splice() etc on
  older perls;

* then liaise with module authors who do direct op tree manipulation at the
  C level about making their code work in the new regime;

* Once enough time has passed and most modules are fixed, make
  PERL_OP_PARENT the default.

In terms of fixing up XS code: if the code just examines but doesn't
modify the op tree, then its really just a case of sprinkling OP_SIBLING()
about; for example

    for (kid = cUNOPo->op_first; kid; kid = kid->op_sibling)

becomes

    for (kid = cUNOPo->op_first; kid; kid = OP_SIBLING(kid))

For code that modifies op trees: if it's just done via existing core
functions,  eg newBINOP(..., node1, node2); then nothing needs changing.
If the code directly modifies op_first, op_last and op_sibling pointers,
then it needs fixing. Usually the simplest way to to this is to replace
the direct manipulation of those pointers with calls to
op_sibling_splice(); once that's done, everything should just work
automatically. This is the route I took to fix up most of op.c. If that
isn't suitable, the module author should probably liaise with me and see
if anything needs adding to the API. If all else fails, the author will
have to directly handle the correct setting and updating of op_sibling,
op_lastsib and op_last.

For code that accesses the op tree at the perl level using B, nothing needs
changing (except that you will now have the ->parent method to make your
life easier).

Note that for a perl compiled *without* the PERL_OP_PARENT define, no XS
modules need immediate fixing, apart from potentially on debugging builds,
where I've added some stricter checking that the op tree structure is
consistent, e.g. that op_last does in fact point at the last sibling. If
XS code produces bad op trees, then asserts will start triggering with this
branch. It also fixes up a few existing core op tree nodes where op_last
wasn't right.


-- 
Hofstadter's Law: It always takes longer than you expect, even when you
take into account Hofstadter's Law.

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About