George Greer wrote on 2009-09-29:
> On Mon, 28 Sep 2009, George Greer wrote:
>
>> On Mon, 28 Sep 2009, Steve Hay wrote:
>>> I had a lot of trouble in the past with tests hanging in my smokers,
>>> and Jerry Hedden kindly came up with a solution: the Test::watchdog()
>>> function.
>>>
>>> That function is called from several test scripts that are liable to
>>> hang (a couple in each of: IO, threads, and threads/shared), and
>>> should kill them if they hang around too long.
>>>
>>> Perhaps we need to add watchdog() to a few more test scripts?
>> If it happens again, I'll keep track of the tests that are active at
>> the time. Doesn't happen often though. Only one I remember from this
>> time is the one I canceled.
> Ok, maybe I under-estimated "not often"... it did it again. Looks like
> the same three suspects from last time:
>
> ./perl -I.. -MTestInit io/openpid.t
> ./perl -I.. -MTestInit io/perlio.t
> ./perl -I.. -MTestInit io/perlio_leaks.t
>
> All three in a 'Wait:Executive' state.
>
> Last lines in the log:
> io/fflush.t ....................................................... ok
> io/fs.t ........................................................... ok
> io/inplace.t ...................................................... ok
> io/iprefix.t ...................................................... ok
> io/layers.t ....................................................... ok
> io/nargv.t ........................................................ ok
> io/open.t ......................................................... ok
>
> Same flags as last time: -Dusedevel -Duseithreads -DDEBUGGING
>
> Only activity that Process Monitor shows for the three is a "Thread
> Create" followed immediately by "Thread Exit", and it has been 5 minutes
> since each test did that their one time.
>
> Oddly, Process Explorer shows "Process | <Non-existent Process>(916)" in
> addition to the three live (stalled) tests. Perhaps the job driver hung
> trying to reap a child?
> I wouldn't expect the other three tests to be
> still alive then, though.
>
> I'll try killing "io/perlio_leaks.t" this time instead of "io/openpid.t"
> like last time. Hrm, that didn't seem to do anything; maybe it does
> matter which I kill. Ok, "io/perlio.t" next then. Still not budging...
> guess it is "io/openpid.t". The "Process" type now has three
> non-existent processes: 916 and the two I killed 520, 500. Now to kill
> 424, io/openpid.t... The driver reaped the children and started running
> again. I think we have a suspect.

Ok, http://perl5.git.perl.org/perl.git/commit/9b70911 should stop
openpid.t hanging again.