At 11:18 AM -0500 1/24/05, Eric Garland wrote:
>I'm setting up a Boss/Worker threaded program that runs for a very
>long time.  There are times where a worker thread will run into
>errors and exit.  The obvious solution would be to prevent that from
>happening but I would like to create a fault tolerant framework that
>doesn't tip over at the slightest hint of a problem.  I would like
>to have the Boss detect when a worker exits and restart a worker
>thread in its place.
>
>So far, I find no functions that will determine if a thread has
>exited outside of join().  The obvious issue with join is that it
>blocks so I need a join thread for each worker thread so that I can
>send a message back to the Boss to restart the thread.  This seems
>to wildly increase the memory usage.  There are large shared data
>structures in this program and it already seems to be at the memory
>limits of the machine without these join threads.

You might want to have a look at Thread::Running on CPAN, by yours truly.

>Is there a way to remove the shared variables from the join threads
>so that they take up as little memory as possible?

Not sure what you mean by that.

>Better yet, is there a way to detect thread deaths outside of a
>dedicated join thread, perhaps a $thread->ready_for_join() type
>function or a $thread->join_non_blocking() or even a
>$thread->join_any_of_these(@threads)?

From the pod of Thread::Running:

NAME
    Thread::Running - provide non-blocking check whether threads are running

SYNOPSIS
    use Thread::Running;      # exports running(), exited() and tojoin()
    use Thread::Running qw(running);   # only exports running()
    use Thread::Running ();   # threads methods only

    my $thread = threads->new( sub { whatever } );
    while ($thread->running) {
        # do your stuff
    }

    $_->join foreach threads->tojoin;

    until (threads->exited( $tid )) {
        # do your stuff
    }

    sleep 1 while threads->running;

Hope this is what you're looking for.

Liz
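
For the Boss/Worker case in the question, the calls in the SYNOPSIS could be combined into a polling loop along the following lines. This is only a rough sketch, not tested against any particular release of Thread::Running. It assumes that threads->tojoin returns thread objects that have finished but have not yet been joined, as the SYNOPSIS suggests. The worker() sub, the pool size, the restart cap and the one-second polling interval are all made up for illustration.

```perl
use strict;
use warnings;
use threads;
use Thread::Running;    # adds running(), exited() and tojoin()

my $WORKERS      = 3;   # hypothetical pool size
my $MAX_RESTARTS = 5;   # hypothetical cap so that this sketch terminates

# Hypothetical worker: pretends to work for a while, then exits.
sub worker {
    my ($id) = @_;
    sleep 1 + int rand 3;
    return $id;
}

my $next_id  = 0;       # running count of workers started so far
my $alive    = 0;       # workers started but not yet joined
my $restarts = 0;

# Start a worker and keep the bookkeeping in one place.
sub spawn {
    threads->new( \&worker, $next_id++ );
    $alive++;
}

spawn() for 1 .. $WORKERS;

# Boss loop: poll for finished workers instead of parking a
# dedicated join thread on each one.
while ($alive) {
    foreach my $done ( threads->tojoin ) {
        $done->join;    # assumed not to block, the thread has exited
        $alive--;
        if ( $restarts < $MAX_RESTARTS ) {
            spawn();    # put a fresh worker in its place
            $restarts++;
        }
    }
    sleep 1;            # polling interval
}
```

A real Boss would drop the restart cap and keep replacing workers for as long as the program runs. The point of the loop is that only the Boss thread polls, so no extra join thread (with its own copy of the data) is needed per worker.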