Front page | perl.perl5.porters |
Postings from November 2003
dealing with objects cloning under ithreads
From:
Stas Bekman
Date:
November 14, 2003 01:56
Subject:
dealing with objects cloning under ithreads
Message ID:
3FB4A6D0.50805@stason.org
I need to clone an object which contains a reference to a C struct, so it
won't crash on DESTROY under threads trying to free an already freed by
another thread C struct. I'm trying to figure out how CLONE is supposed to
work, but can't find any examples. Looking at the source code:
/* Call the ->CLONE method, if it exists, for each of the stashes
identified by sv_dup() above.
*/
while(av_len(param->stashes) != -1) {
HV* stash = (HV*) av_shift(param->stashes);
GV* cloner = gv_fetchmethod_autoload(stash, "CLONE", 0);
if (cloner && GvCV(cloner)) {
dSP;
ENTER;
SAVETMPS;
PUSHMARK(SP);
XPUSHs(sv_2mortal(newSVpv(HvNAME(stash), 0)));
PUTBACK;
call_sv((SV*)GvCV(cloner), G_DISCARD);
FREETMPS;
LEAVE;
}
}
it looks for the function CLONE in the current stash and if found calls it
with that stash name as the only argument. Indeed, this test program when it
doesn't fail with an error "A thread exited while 2 threads were running.",
which happens randomly:
package main;
use threads;
sub CLONE {
my $stash_name = shift;
my %stash = %{"$stash_name\::"};
print "CLONE was called,\n keys: ", join " ", keys %stash;
};
$obj = 1;
threads->new(sub { })->detach;
we get:
% perl-5.8.2-ithread /tmp/clone
CLONE was called,
keys: async ARGV BEGIN AutoLoader:: @ _<universal.c STDIN vars:: Mac:: 0
<none>:: obj VMS:: CORE:: ▒ ! ARNING_BITS warnings:: CLONE Internals:: ENV
stdin / threads:: Scalar:: _<threads.c main:: _<xsutils.c _<perlio.c stderr
SIG overload:: _<DynaLoader.c INC - Carp:: strict:: DB:: stdout Config:: + ]
IO:: __ANON__ _<perlmain.c 1 attributes:: PerlIO:: " Regexp:: $ DynaLoader:: _
STDOUT UNIVERSAL:: STDERR Exporter::
_</home/stas/perl/5.8.1-ithread/lib/5.8.1/i686-linux-thread-multi/auto/threads/threads.so
utf8::/
Now let's say I have a class Foo [2] which creates objects with C struct guts
in it. Then the the user program becomes:
package main;
use threads;
$obj = Foo->new();
threads->new(sub { })->detach;
And here I'm lost. How does the user program know that it needs to call
Foo::CLONE_OBJ (or whatever the cloning function is) on $obj? You don't expect
users to write code to traverse the stash, look for the objects belonging to
class Foo and call $obj->CLONE_OBJ on it. I suppose Foo could export function
CLONE into the user namespace, so that function will traverse the stash and
clone its own objects. But what happens if there are two classes used by the
user program and both need cloning?
perlmod.pod has very little to say about it, but:
If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().
Does it suggest that class Foo need to maintain an internal storage (e.g.
hash) of all objects it creates, weakened so not to interfere with normal
DESTROY, and have Foo::CLONE explicitly clone all objects in that storage.
Like so:
package Foo;
use Scalar::Util;
my %objects = ();
sub new {
# let's say that new_data() returns a C struct pointer
my $data = new_data();
my $self = bless $data, __PACKAGE__;
my $copy = $self;
Scalar::Util::weaken($copy);
$objects{"$data"} = $copy;
return $self;
}
sub CLONE {
while (my($k, $v) = each %objects) {
# replace the guts of the object with new $data
# this affects the object in the user program as well
$$v = $v->clone_obj;
}
}
sub DESTROY {
my $self = shift;
# some normal DESTROY code
# ...
# cleanup the storage
my $data = $$self;
delete $packages{"$data"}; # remove from storage
}
sub clone_obj {
my $self = shift;
my $data = $$self;
return clone_data($data); # some XS function that does the cloning
}
does this seem the way to go? this code is untested.
Ideally it'd be nice to just have a CLONE method to be invoked on any cloned
object. I suppose it wasn't added because of the huge overhead it'll add to an
already deadly slow [1] perl_clone to check every scalar whether it can('CLONE')?
References:
1) at the moment perl_clone seems to be an almost impractical thing in a real
world if you have lots of modules loaded. On an unloaded CPU in the mod_perl
2.0 test suite (About 150 small packages) it takes at least 5 seconds to run a
single perl_clone. So to start threaded mpm Apache with 12 interpreters will
take about a minute and longer. If you attempt start 120 interpreters you will
have to wait at least 10 minutes. Heh.
2) here is an example of a trivial program which will either hang or segfault
depending on your malloc implementation:
package main;
use threads;
use GTop;
$Foo::gtop = GTop->new;
threads->new(sub { print "thread started\n";})->detach for (1..3);
here all 3 threads get a copy $Foo::gtop, and then try to destroy it (4
times). Only the first one does it successfully, as gtop returns an object
which is a wrapper around a C struct. If you don't have GTop, replace it with
some other class which returns a similar object and you will see what the
problem is.
Oh, the joy of threads.
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
-
dealing with objects cloning under ithreads
by Stas Bekman