develooper Front page | perl.perl5.porters | Postings from February 2003

[PATCH] jumbo closure fix

Dave Mitchell
February 26, 2003 06:53
[ requires regen_headers ]

This patch fixes all the major outstanding closure bugs that I am aware
of (well, apart from ones pertaining to /(?{...})/ ).
I've achieved this by completely rewriting pad_findlex() from scratch,
so effectively re-implementing closures from a blank sheet.

The most notable differences are:

* named subs now close on behalf of inner subs; eg the following now
prints 1 rather than undef:

	my $x = 1;
	sub f { sub { print $x }->() }

* run-time cloning is now a lot faster, eg $a = sub {$x} is 15% faster,
$a = sub {$x+$y} is 25% faster. This is because at compile time, the index
into the parent pad is recorded for each outer lex, so cloning just
involves grabbing each appropriate value from the parent pad rather than
calling pad_findlex() for each outer lex.

* warnings are changed and more pervasive:

the warning 'Variable "%s" may be unavailable' is now the more assertive
'Variable "%s" is not available', and some cases that formerly caused
this warning now cause 'Variable "%s" will not stay shared' instead, and
vice versa. Several cases that formerly quietly did something strange
(usually involving a mysterious shared undef value), now give the 'is not
available' warning.

* There's a new global variable PL_cv_has_eval, that gets set during
compilation if any eval-like constructs are found within the CV's ops.
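
The first point in the list above can be checked with a small script. This is a minimal sketch, assuming a perl that includes this change; the `return` statements and the `$got` variable are additions for illustration only.

```perl
use strict;
use warnings;

my $x = 1;

# f() never mentions $x itself; only the anonymous sub inside it does.
# With the new behaviour f() closes over $x on behalf of the inner sub.
sub f { return sub { return $x }->() }

my $got = f();
print defined $got ? "f() returned $got\n" : "f() returned undef\n";
```

On a perl without the fix, the same script would be expected to report undef, as described in the first bullet.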

Will this patch break existing code? Quite possibly, although the things it
handles differently are often things that would have generated a warning
anyway.  The main problem is if there is code that tests for a particular
warning, since some of the warnings have changed. In general, dodgy
constructs that used to fail silently are now likely to generate warnings.

I've tested this against my production code at work (which relies on half
of CPAN AFAICT :-), and it didn't seem to break anything, so there's hope.

(If I'd known when I started work on this patch in late December, that it
would take me till the end of Feb, I might not have bothered!)


Here are the full details.

(You don't have to read all this - this is mainly for posterity, so
I can remember why on earth I did things the way I did :-)

Anon subs now strictly capture only at creation (aka run) time rather than
compile time; formerly it was a mixture of the two, leading to problems
with objects not getting freed soon enough.

Anon prototypes are no longer recursively cloned: formerly, when cloning
the outer of these two subs: sub { sub {...} }, the inner sub prototype
was cloned at the same time, then when the inner sub was cloned, it would
be cloned from the (cloned) prototype. Now, cloning the outer sub doesn't
do anything else. The recursion used to be required to ensure that the
CvOUTSIDE of inner prototypes pointed to the right outer anon; now this
is determined at clone time by calling find_runcv().

This doesn't work well in the presence of non-closure (shared) anon
prototypes that contain evals, for example:

    my $x; sub { sub { eval '$x' }->() }->()

In this case, since the inner sub is shared rather than cloned, it never
uses find_runcv() to stop its CvOUTSIDE from pointing to the outer
prototype rather than the outer cloned sub. To avoid this, we turn on
the CvCLONE flag for any anon prototypes that may have any eval-capable
construct in any nested scope. This includes eval '', //ee, /(?{..})/
and /$var/. We do this by introducing a new global var PL_cv_has_eval,
which is set to zero at the start of every CV compilation, and is set to 1
by any construct that may have eval-like behaviour. At the end of compiling
the CV, we test this flag, and if true, follow up the chain of CVs
marking any anons as cloneable. We also do this if running with -d,
since the debugger can execute an eval anywhere.
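
A minimal sketch of the situation this handles, assuming a perl that includes this change. Here the lexical lives in the outer anon sub rather than at file scope, and `$outer` is a name made up for the illustration. The inner anon sub names no outer lexical, so on its own it would be a shared prototype; the string eval is what makes it cloneable, so the eval should see the $x of the call that is actually running.

```perl
use strict;
use warnings;

# The inner anon sub contains an eval but mentions no outer lexical by name.
my $outer = sub {
    my $x = shift;
    return sub { return eval '$x' }->();
};

my @results = ($outer->(1), $outer->(2));
print "eval saw: @results\n";
```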

Since a previous patch of mine freed up the SvIVX and SvNVX fields of
SvFAKE namesvs, I have now put them to new use. IVX stores two flags
giving information about the real lex referred to by the fake: is it
declared in an anon sub, and is it capable of having multiple instances?
The NVX is used for anon prototypes to record the index in the parent
pad of where the lexical can be captured at clone time.

The code now strictly follows the idea that 'subs capture their lexical
context at creation time', where for most CV types (named subs, evals
etc), creation time is the same as compilation time, while for anon subs,
creation time is when the 'sub' operator is executed.
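
For anon subs, "creation time" can be illustrated with the usual counter idiom. This is a sketch only; `make_counter` and the variable names are hypothetical. Each execution of the 'sub' operator yields a separate closure holding its own instance of $n.

```perl
use strict;
use warnings;

# Each call to make_counter() executes the 'sub' operator once, creating a
# fresh closure that captures that call's own instance of $n.
sub make_counter {
    my $n = shift;
    return sub { return $n++ };
}

my $from_ten    = make_counter(10);
my $from_twenty = make_counter(20);

# The two closures do not share state.
print join(' ', $from_ten->(), $from_ten->(), $from_twenty->()), "\n";
```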

The use of a lexical in any inner sub now treats all intermediate subs
as also using the lexical, eg the following have the same effect

    sub f1 { my $x;  sub {     sub f2 {     sub { $x } } } }
    sub f1 { my $x;  sub { $x; sub f2 { $x; sub { $x } } } }

All of these inner subs will either capture the current instance of $x
at compile time (named subs, evals etc), or note the index in the parent
pad for later capture (anon subs).

Eval-like actions now distinguish between CVs still being compiled and
CVs that have finished compilation. In the latter case, pad_findlex()
will still follow up the chain of CVs trying to capture the current
instance of a lexical, but will not add any fake entries to
already-compiled pads. For example, in

    my $x; sub f { eval 'sub g { $x }' },

during the eval compilation, a fake entry will be added to g's pad but not
f's. g will capture the current value of $x, or if it is not currently in
scope, a 'variable is not available' warning will be issued, and a fresh
undef SV will be used instead. (This makes use of the SvPADSTALE flag).
Not adding fakes to a compiled pad means that the debugger no longer has
side-effects when getting the current values of vars (which it does via an
eval with a funny scope). It also avoids some problems where a non-closure
anon sub suddenly gets promoted into a closure prototype.
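
The in-scope case can be sketched as follows, assuming a perl that includes this change. The `return`, the trailing `1`, the `or die $@` and the `$got` variable are additions so that the script is checkable; they are not part of the example above.

```perl
use strict;
use warnings;

my $x = 42;

# f() never mentions $x, so its already-compiled pad has no entry for it.
sub f { eval 'sub g { return $x } 1' or die $@ }

f();    # compiles g() at run time, while the file-scoped $x is in scope

my $got = g();
print "g() returned $got\n";
```

Since $x is still live when the eval runs, g should capture it and no 'is not available' warning is expected here.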

The 'Variable "%s" may be unavailable' warning used to be restricted
specifically to the following form:

    sub f1 { my $x; sub f2 { sub {$x} } }

and, depending on whether f2() was called from f1(), or called when f1()
was inactive, the anon sub might or might not have captured the current
value of $x - hence the 'may' part of the warning.  The old behaviour was
that at clone time, pad_findlex() would search up the chain of CVs looking
for a variable to capture. Now instead, f2 captures the first instance of
f1's $x (and so a 'will not stay shared' warning is issued instead),
then the anon captures $x from f2's pad, so it will always capture the
first instance of f1's $x, regardless.
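
The new behaviour for this form can be observed with a script like the one below. It is a sketch, assuming a perl that includes this change; the `__WARN__` handler, the `@warnings` and `@seen` arrays and the argument values are made up for the illustration.

```perl
use strict;
use warnings;

my @warnings;
# Install the handler in a BEGIN block so it also sees compile-time warnings.
BEGIN { $SIG{__WARN__} = sub { push @warnings, @_ } }

sub f1 {
    my $x = shift;
    # f2 is created at compile time and captures the first instance of $x.
    sub f2 { return sub { return $x } }
    return f2()->();
}

# Both calls go through f2, so the anon sub sees the first instance each time.
my @seen = (f1('first'), f1('second'));
print "anon saw: @seen\n";
print "warning: $_" for @warnings;
```

The second call to f1() gets a fresh $x, but the anon sub still reports the value stored in the first instance, which is what 'will not stay shared' warns about.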

The 'is not available' warning is now issued in two cases: a) at compile
time, when trying to capture from an anon prototype, since anons are now
strictly only created at run time, eg

    sub { my $x; sub f {$x} }

would give a compile-time 'is not available' warning, and b) at run time
(usually via eval), where the thing being captured isn't in scope, eg

    sub f { my $x; sub { eval '$x' } }


    sub { my $x; sub f { eval '$x' } }
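
Cases a) and b) can be sketched together, assuming a perl that includes this change. The names `h`, `$y`, `$anon`, `@warnings` and the warning handler are hypothetical additions, and the run-time case calls the returned anon sub after its creator has finished.

```perl
use strict;
use warnings;

my @warnings;
# Install the handler in a BEGIN block so it also sees compile-time warnings.
BEGIN { $SIG{__WARN__} = sub { push @warnings, @_ } }

# a) compile time: named sub f is created while the enclosing anon sub is
#    still only a prototype, so there is no instance of $x to capture.
my $anon = sub { my $x = 1; sub f { return $x } };

# b) run time: by the time the eval runs, h() has returned and its $y has
#    gone out of scope; the inner anon sub never mentioned $y by name.
sub h { my $y = 2; return sub { return eval '$y' } }

my $from_f    = f();
my $from_eval = h()->();

print "f():    ", (defined $from_f    ? $from_f    : 'undef'), "\n";
print "eval:   ", (defined $from_eval ? $from_eval : 'undef'), "\n";
print "warned: $_" for @warnings;
```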

You live and learn (although usually you just live).
