NWCLARK TPF grant report #24

Nicholas Clark
February 22, 2012 07:20
[Hours]		[Activity]
2012/02/13	Monday
 0.50		RT #110248
 1.00		perlfunc
 0.25		process, scalability, mentoring
 5.00		reading/responding to list mail

2012/02/14	Tuesday
 0.50		ExtUtils::MakeMaker::MM_Unix::perldepend
 1.75		Pod::Functions
 1.00		Pod::Html
 0.25		RT #110736
 0.50		continue
 1.00		euid
 0.25		perlfunc
 4.75		reading/responding to list mail

2012/02/15	Wednesday
 0.50		defined and exists
 5.75		reading/responding to list mail
 0.50		t/porting/pending-author.t

2012/02/16	Thursday
 7.00		another backref panic

2012/02/17	Friday
 0.75		Pod::Functions
 1.50		another backref panic
 4.25		reading/responding to list mail

2012/02/18	Saturday
 0.50		Data::Dumper

2012/02/19	Sunday
 0.75		arrays
 1.00		perlfunc
 0.25		process, scalability, mentoring

Which I calculate is 40.00 hours

Zombie undead global backref destruction panics strike *again*. I don't know
what it is with these critters, but another variant turned up, in almost the
same place, and triggered by almost the same code. I *think* that this is
the last one, simply because this was the third and final branch of the
backref code, and now it's been diagnosed and fixed.

In this case, the problem is that it's possible for the last (strong)
reference to the target [tsv in the code in Perl_sv_del_backref()] to be
released, and the target freed, *before* the last thing holding a weak
reference goes away. If both
survive longer than the backreferences array (the previous cause of
problems), then when the referent's reference count drops to 0 and it is
freed, it's not able to chase the backreferences, so those backreferences
aren't NULLed.

For example, a CV holds a weak reference to its stash. If both the CV and
the stash survive longer than the backreferences array, and the CV gets
picked for the SvBREAK() treatment first, *and* it turns out that the stash
is only being kept alive because of an our variable in the pad of the CV,
then midway during CV destruction the stash gets freed, but CvSTASH() isn't
set to NULL. It ends up pointing to the freed HV. Hence that pointer is
chased into Perl_sv_del_backref(), but because it's pointing to a freed HV,
the relevant magic structure is no longer there to be found, and a NULL
pointer is assigned to a local variable. Subsequent code panicked because it
thought that could never happen, at least not without a bug. Except, as the
investigation showed, it could happen quite legitimately in exactly this
scenario. During global destruction, all bets are off.
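
To make the ordering problem concrete, here is a toy model in C. It is only
a sketch: the struct and function names (target, weak_holder, weaken,
discard_backrefs, release, dangles) are invented for illustration and are
not perl's real data structures or API. The "freed" flag stands in for
free(), so the stale pointer can be inspected safely.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BACKREFS 4

struct target;

/* Hypothetical holder of a weak reference (think: a CV and its stash). */
struct weak_holder {
    struct target *weak;        /* not counted in target->refcnt */
};

/* Hypothetical referent (think: the stash). */
struct target {
    int refcnt;                 /* strong references only */
    int freed;                  /* toy stand-in for actually calling free() */
    int have_backrefs;          /* 0 once the backreferences array is gone */
    struct weak_holder *backrefs[MAX_BACKREFS];
    size_t n_backrefs;
};

void target_init(struct target *t)
{
    t->refcnt = 1;
    t->freed = 0;
    t->have_backrefs = 1;
    t->n_backrefs = 0;
}

/* Record a weak reference so that the target can find it again later. */
void weaken(struct weak_holder *h, struct target *t)
{
    h->weak = t;
    if (t->have_backrefs && t->n_backrefs < MAX_BACKREFS)
        t->backrefs[t->n_backrefs++] = h;
}

/* Models global destruction tearing down the backreferences array early. */
void discard_backrefs(struct target *t)
{
    t->have_backrefs = 0;
    t->n_backrefs = 0;
}

/* Drop one strong reference. On reaching zero, NULL every weak pointer
 * that can still be found, then mark the target as freed. */
void release(struct target *t)
{
    if (--t->refcnt > 0)
        return;
    if (t->have_backrefs) {
        for (size_t i = 0; i < t->n_backrefs; i++)
            t->backrefs[i]->weak = NULL;
    }
    t->freed = 1;
}

/* 1 if the holder is left pointing at a freed target, else 0. */
int dangles(const struct weak_holder *h)
{
    return h->weak != NULL && h->weak->freed;
}

/* Usual order: the array is intact when the target dies. */
int demo_normal(void)
{
    struct target stash;
    struct weak_holder cv;
    target_init(&stash);
    weaken(&cv, &stash);
    release(&stash);
    return dangles(&cv);
}

/* Out of order: the array goes first, then the target, then the holder. */
int demo_out_of_order(void)
{
    struct target stash;
    struct weak_holder cv;
    target_init(&stash);
    weaken(&cv, &stash);
    discard_backrefs(&stash);
    release(&stash);
    return dangles(&cv);
}
```

In this model demo_normal() leaves the weak pointer NULLed, whereas
demo_out_of_order() leaves it pointing at a freed target, which is the state
that code chasing the pointer later has to be prepared for.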

I don't believe that "better" destruction ordering is going to help here -
during global destruction there's always going to be the chance that
something goes out of order. We've tried to make it foolproof before, and it
only resulted in evolutionary pressure on fools. Which made us look foolish
for our hubris. :-(

I think that the reason that all these critters are shuffling towards us
*now*, despite being in code that's quite long lived, is that since
5.14.0 Dave has re-worked the SV destruction code. Previously it would
recurse into data structures, which had the unpleasant side effect of
blowing the C stack when it tried to do too much at once. [Crash and burn -
not good] Dave has made much of that code iterative now, which avoids the
crashing. [Generally this is seen as progress :-)] However, it's changed the
destruction order, and I think in some cases that is exposing long-latent
bugs elsewhere in the code that destruction calls, particularly during
global destruction.
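
The difference between the two approaches, and why the order changes, can be
sketched with a toy tree in C. This is not Dave's implementation, and perl's
real destruction code is considerably more involved. The node layout, the
"pending" worklist link and all function names here are assumptions made up
for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container node. */
struct node {
    int id;
    struct node *first_child;
    struct node *next_sibling;
    struct node *pending;       /* worklist link, used only while freeing */
};

struct node *node_new(int id, struct node *parent)
{
    struct node *n = calloc(1, sizeof *n);
    if (!n)
        abort();
    n->id = id;
    if (parent) {
        n->next_sibling = parent->first_child;
        parent->first_child = n;
    }
    return n;
}

/* Recursive: one C stack frame per level of nesting, so very deep
 * structures can exhaust the stack. Children are freed before parents. */
void free_recursive(struct node *n, int *order, size_t *len)
{
    struct node *c = n->first_child;
    while (c) {
        struct node *next = c->next_sibling;
        free_recursive(c, order, len);
        c = next;
    }
    order[(*len)++] = n->id;
    free(n);
}

/* Iterative: constant C stack depth. Children are queued on a worklist
 * threaded through the nodes, and each parent is freed before them. */
void free_iterative(struct node *root, int *order, size_t *len)
{
    struct node *work = root;
    root->pending = NULL;
    while (work) {
        struct node *n = work;
        work = n->pending;
        for (struct node *c = n->first_child; c; ) {
            struct node *next = c->next_sibling;
            c->pending = work;  /* push child onto the worklist */
            work = c;
            c = next;
        }
        order[(*len)++] = n->id;
        free(n);
    }
}

/* Builds the chain 1 -> 2 -> 3, where each node contains the next. */
struct node *build_chain(void)
{
    struct node *a = node_new(1, NULL);
    struct node *b = node_new(2, a);
    node_new(3, b);
    return a;
}
```

For the chain from build_chain(), the recursive version records the ids as
3, 2, 1 and the iterative version as 1, 2, 3. Both free everything, but code
that quietly relied on the old order sees its assumptions broken.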

Nicholas Clark