Front page | perl.perl5.porters |
Postings from December 2011
weak references and global destruction
Thread Next
From:
Nicholas Clark
Date:
December 1, 2011 05:06
Subject:
weak references and global destruction
Message ID:
20111201130635.GJ37285@plum.flirble.org
Karl hit a problem when refactoring - t/re/regexp_qr_embed_thr.t
started failing with panic: del_backref during global destruction.
After a lot of digging, it turns out that this isn't a regular expression
problem. It's a general implementation problem with weak references.
I can reduce the problem somewhat - cut down versions of t/re/re_tests
and t/re/regexp.t attached. These "work for me" at commit c23e3ba527e40a56
on x86_64 Linux with threads, but I'm not certain how sensitive they are to
a lot of things, possibly down to number of entries in %ENV
The end result is hitting this croak():
Breakpoint 1, Perl_sv_del_backref (my_perl=0x81a010, tsv=0x96d0a0, sv=0xa0ca50)
at sv.c:5615
5615 Perl_croak(aTHX_ "panic: del_backref");
(gdb) where
#0 Perl_sv_del_backref (my_perl=0x81a010, tsv=0x96d0a0, sv=0xa0ca50)
at sv.c:5615
#1 0x0000000000510cf4 in Perl_sv_clear (my_perl=0x81a010, orig_sv=0xa0ca50)
at sv.c:6110
#2 0x0000000000511d7c in Perl_sv_free2 (my_perl=0x81a010, sv=0xa0ca50)
at sv.c:6445
#3 0x000000000050367b in do_clean_all (my_perl=0x81a010, sv=0xa0ca50)
at sv.c:610
#4 0x0000000000502c72 in S_visit (my_perl=0x81a010,
f=0x5035e4 <do_clean_all>, flags=0, mask=0) at sv.c:420
#5 0x00000000005036af in Perl_sv_clean_all (my_perl=0x81a010) at sv.c:629
#6 0x000000000043e95d in perl_destruct (my_perl=0x81a010) at perl.c:1073
#7 0x000000000041c0a5 in main (argc=4, argv=0x7fffffffe748,
env=0x7fffffffe770) at perlmain.c:131
context being:
if (SvTYPE(*svp) == SVt_PVAV) {
...
}
else {
/* optimisation: only a single backref, stored directly */
if (*svp != sv)
Perl_croak(aTHX_ "panic: del_backref");
*svp = NULL;
}
where:
(gdb) call Perl_sv_dump(my_perl, tsv)
SV = PVHV(0x822e90) at 0x96d0a0
REFCNT = 3
FLAGS = (OOK,SHAREKEYS,CLONEABLE)
ARRAY = 0xa380e0 (0:27, 1:27, 2:9, 4:1)
hash quality = 108.5%
KEYS = 49
FILL = 37
MAX = 63
RITER = -1
EITER = 0x0
NAME = "Data::Dumper"
ENAME = "Data::Dumper"
BACKREFS = 0xa12a40
SV = UNKNOWN(0xff) (0xa3f668) at 0xa12a40
REFCNT = 0
FLAGS = ()
MRO_WHICH = "dfs" (0x5f38e0)
CACHE_GEN = 0x2
PKG_GEN = 0x2e
MRO_LINEAR_CURRENT = 0xa73bd8
SV = UNKNOWN(0xff) (0xa3f6e0) at 0xa73bd8
REFCNT = 0
FLAGS = ()
ISA = 0xa06bf0
SV = PVHV(0x975d00) at 0xa06bf0
REFCNT = 1
FLAGS = (READONLY,SHAREKEYS)
ARRAY = 0x941bc0 (0:5, 1:3)
hash quality = 150.0%
KEYS = 3
FILL = 3
MAX = 7
RITER = -1
EITER = 0x0
(gdb) call Perl_sv_dump(my_perl, sv)
SV = PVGV(0xae1b20) at 0xa0ca50
REFCNT = 0
FLAGS = (BREAK,MULTI)
NAME = "bootstrap"
NAMELEN = 9
GvSTASH = 0x96d0a0 "Data::Dumper"
GP = 0x0
(gdb) call Perl_sv_dump(my_perl, *svp)
SV = UNKNOWN(0xff) (0xa3f668) at 0xa12a40
REFCNT = 0
FLAGS = ()
The code expects *svp to equal sv. It's not. Hence the croak.
The three values are:
tsv: the stash %Data::Dumper::
*svp: its backref pointer
sv: the glob *Data::Dumper::bootstrap
what's supposed happen is that *svp is supposed to be an AV, and one of the
elements of it is the backpointer to *Data::Dumper::bootstrap
what has actually happened is that that AV has already been freed, before
either the stash or the GV, hence the code above assumes that it's the
"optimised" case [avoid using an AV to store one thing], and the sanity
test fails.
How it happen is a bit more fun. At the start of global destruction,
the AV has 2 references. *Both* are from the stash. The intent is
/* A discussion about the backreferences array and its refcount:
*
* The AV holding the backreferences is pointed to either as the mg_obj of
* PERL_MAGIC_backref, or in the specific case of a HV, from the
* xhv_backreferences field. The array is created with a refcount
* of 2. This means that if during global destruction the array gets
* picked on before its parent to have its refcount decremented by the
* random zapper, it won't actually be freed, meaning it's still there for
* when its parent gets freed.
However, %Data::Dumper:: has 3 (legitimate) references. One is from the symbol
table. 2 are from the pads for our variables.
Result - the zapper keeps dropping refcounts by 1 on each iteration, and
the AV gets zapped first. That's not supposed to happen.
Problem is, I think that it's always possible for this to happen as is.
I don't think that bumping the reference count for the backref AV is ever
going to work, as there are various ways that its owner SV can end up with
a higher reference count, and hence outlive it at global destruction, as
this example demonstrates.
I can think of 3 ways to resolve this. The list probably isn't exclusive
0: Eliminate the sanity cross checks at global destruction time.
It's clear that they can't be relied on, and during the
"grand closing down sale, everything must go" I don't think that it
really matters, as everything is going to get freed anyway
1: Switch backreferences to use a private malloc()ed structure, instead of
an AV.
This would probably actually save memory, as the use case here doesn't
require several features of AVs (reference counts, magic, shift off
the front) which cost about 6 pointers-worth of overhead.
[The same is true of (at least) pads]
The downside seems to be that the optimisation of "avoid the array
completely" is lost
2: Add a new step at the start of global destruction to release all weak
references.
Everything is getting freed anyway. Code can't rely on references to
things continuing to exist during global destruction (already), so this
change doesn't break existing invariants. (Or lack thereof)
This might actually be an optimisation, as currently global destruction
cleans up everything "nicely", which means that as it encounters things
which are weak references, it frees them one by one, and that necessitates
tidying up the backreference each time. That's an O(n) search for each,
O(n²) in total for n them. Whereas linearly scanning each backref array
and setting the weak references to &PL_sv_undef would be O(n)
However, I don't think that (2) is actually a complete solution, because
it strikes me that it's possible for destructors fired during global
destruction to actually create new weak references, which then in turn
get released out of order, such that the sanity cross checks are violated.
Hence I'm tempted to think that (0) and (2) together are the right solution
here.
Background to the reference count bump:
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2003-07/msg00111.html
Commit d99b02a1a9949639 (which applies the above patch)
Commit 0565a181ab920bd3, specifically the mg.c hunk:
diff --git a/mg.c b/mg.c
index 6d71b21..1423df0 100644
--- a/mg.c
+++ b/mg.c
@@ -2040,7 +2040,9 @@ Perl_magic_killbackrefs(pTHX_ SV *sv, MAGIC *mg)
SV **svp = AvARRAY(av);
PERL_UNUSED_ARG(sv);
- if (svp) {
+ /* Not sure why the av can get freed ahead of its sv, but somehow it does
+ in ext/B/t/bytecode.t test 15 (involving print <DATA>) */
+ if (svp && !SvIS_FREED(av)) {
SV *const *const last = svp + AvFILLp(av);
while (svp <= last) {
Nicholas Clark
PS &ithreads::BOOT sets PL_destruct_level to 2 globally. Is this really
correct for the parent interpreter?
Thread Next
-
weak references and global destruction
by Nicholas Clark