[ID 20010629.002] perl segfaults unpredictable with valid code(Cookbook:p570,571 cmd3sel) concerns presumably a race condition between'waitpid' and 'open' and signalhandling
From: Wengatz Herbert
Date: June 29, 2001 08:39
Subject: [ID 20010629.002] perl segfaults unpredictable with valid code(Cookbook:p570,571 cmd3sel) concerns presumably a race condition between'waitpid' and 'open' and signalhandling
Message ID: 3B3CA108.4ABAE99F@mchr2.siemens.de
Hi there!
We have a rather nasty problem here which we could reproduce with different
Perl versions (5.6.1 and 5.00503) and on different Unix and Unix-like
operating systems (HP-UX 11, Solaris 2.6 (worst), SunOS 4.1.3 and Linux
with kernel 2.2.14).
We are developing software for use as administration tools for huge
Unix networks, so we have a strong interest in our programs running stably
and predictably (the faintest buglet may end up causing expensive havoc).
We want to create a routine that can run external commands safely and
communicate with them. As a basis we took the example from the
Perl Cookbook (ORA), recipe 16.9, on pages 570 and
571 (cmd3sel).
We extended the example a little, and at first we noticed strange
behaviour: sometimes it wrote back the information about the dead
child (which it should) and sometimes it didn't. Since we have to rely
on what we receive there, we investigated further and ended up with an
example script which mostly works the way it should, but sometimes everything
breaks with segmentation violations and even some other unexpected error
messages from inside perl (see below or try it yourself).
We also found out that the errors occur more often when the system load is
higher.
Here is our code (we tried to reduce it as much as we could, and you may
well recognize the code from cmd3sel):
----------->8--- cut here ----8<--------
#!/usr/local/bin/perl -T -w
use IO::Select;
use IPC::Open3;
delete @ENV{qw{IFS CDPATH ENV BASH_ENV PATH}};
# repeat 500 times to really show the effect
for($i = 0 ; $i < 500 ; $i++)
{ @io_channel = ();
  # since we called this script 'bug', the line below
  # will produce output on both, STDOUT and STDERR ('xxx' doesn't
  # exist).
  &system_redirect(\@io_channel,"/bin/ls -l bug xxx");
  print "STDOUT was: ",$io_channel[1],"\n";
  print "STDERR was: ",$io_channel[2],"\n";
}
###############################################################################
# system_redirect()
###############################################################################
sub system_redirect()
{ my($ra_io_channel,@cmd) = @_;
  local $exitstatus = '?';
  my $pid = open3(*CMD_IN,*CMD_OUT,*CMD_ERR,@cmd);
  $SIG{CHLD} = sub
  { if(waitpid($pid,0) > 0)
    { printf("exitstatus of child: %d\n.",$?);
    }
    $exitstatus = $?;
  };
  if(defined $ra_io_channel->[0])
  { print CMD_IN $ra_io_channel->[0];
  }
  close(CMD_IN);
  my $selector = IO::Select->new();
  $selector->add(*CMD_ERR,*CMD_OUT);
  while(@ready = $selector->can_read)
  { foreach $filehandle (@ready)
    { if(fileno($filehandle) == fileno(CMD_ERR))
      { $ra_io_channel->[2] .= <CMD_ERR>;
      }
      else
      { $ra_io_channel->[1] .= <CMD_OUT>;
      }
      if(eof($filehandle))
      { $selector->remove($filehandle);
      }
    }
  }
  close(CMD_OUT);
  close(CMD_ERR);
  return($exitstatus);
}
__END__
#
# The code above, when run in a loop on the commandline (bourne-shell or bash)
# like this (remember, the script was called 'bug' here):
i=0 ; while [ $i -lt 50 ] ; do ./bug | grep STDERR | wc ; i=`expr $i + 1` ; done
#
# produces, for example, the following output:
#
500 4500 26000
99 891 5148
Segmentation fault
500 4500 26000
500 4500 26000
500 4500 26000
500 4500 26000
500 4500 26000
500 4500 26000
211 1899 10972
Segmentation fault
434 3906 22568
26 234 1352
Segmentation fault
48 432 2496
Segmentation fault
Use of uninitialized value in scalar assignment at ./bug line 30.
Use of uninitialized value in scalar assignment at ./bug line 30.
Unable to create sub named "" at ./bug line 30.
274 2466 14248
500 4500 26000
Attempt to free unreferenced scalar at ./bug line 30.
289 2601 15028
Segmentation fault
500 4500 26000
----------->8--- cut here ----8<--------
The example output was generated with perl 5.6.1 under Linux 2.2.14, but
this was the *best* combination we have found so far. The machine
is a Pentium III 650 MHz with 128 MB RAM and otherwise runs without
any errors. Besides, we got almost the same behaviour on all systems we
tested (see above). So this can be neither a CPU-, machine-, nor
OS-dependent bug, but must be something that lurks somewhere deep in perl.
We can only guess that the open3 child dies before the anonymous sub can
catch the signal. And, to mention it again, the error rate
increases dramatically when the system load is increased. (This
points towards a race condition somewhere between "open" and "waitpid".)
The script above may run quite fine on the PIII system mentioned above,
but as soon as I move the mouse or open another xterm, the rate of
segfaults rises dramatically. Just try it yourself.
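For comparison, here is a minimal sketch of a variant that takes the signal handler out of the picture, assuming the asynchronous $SIG{CHLD} handler really is what triggers the problem. It installs no handler at all, drains both pipes with sysread (so buffered <FH> and eof are not mixed with select), and calls waitpid only after both pipes have reached end-of-file. The name run_capture and the 4096-byte read size are made up for illustration. This is not the Cookbook code, and it has not been verified to cure the segfaults on the systems listed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Select;
use IPC::Open3;
use Symbol qw(gensym);

# Hypothetical replacement for system_redirect(): no $SIG{CHLD} handler.
# Returns (stdout text, stderr text, raw wait status as found in $?).
sub run_capture {
    my ($stdin_data, @cmd) = @_;

    # Create the handles explicitly; open3 never autovivifies the stderr handle.
    my ($in, $out, $err) = (gensym, gensym, gensym);
    my $pid = open3($in, $out, $err, @cmd);

    print {$in} $stdin_data if defined $stdin_data;
    close $in;

    my ($out_fd, $err_fd) = (fileno $out, fileno $err);
    my %buf = ($out_fd => '', $err_fd => '');
    my $selector = IO::Select->new($out, $err);

    # can_read returns an empty list once every handle has been removed.
    while (my @ready = $selector->can_read) {
        for my $fh (@ready) {
            my $n = sysread($fh, my $chunk, 4096);
            if ($n) {
                $buf{ fileno $fh } .= $chunk;
            }
            else {
                # 0 means end-of-file, undef means a read error:
                # stop watching this handle in both cases.
                $selector->remove($fh);
            }
        }
    }

    close $out;
    close $err;

    # Reap the child synchronously, after all of its output has been read.
    waitpid($pid, 0);
    my $status = $?;

    return ($buf{$out_fd}, $buf{$err_fd}, $status);
}

# Usage: a command that writes to both streams and exits non-zero.
my ($stdout, $stderr, $status) =
    run_capture(undef, '/bin/sh', '-c', 'echo out; echo err >&2; exit 3');
print "STDOUT was: $stdout";
print "STDERR was: $stderr";
printf "exit code: %d\n", $status >> 8;
```

Because the child is reaped in the normal flow of the program, the exit status no longer depends on when the signal arrives relative to open3 returning.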
We would be very sorry not to be able to use IPC::Open3 because of this,
but it is currently absolutely unreliable and thus unacceptable for sysadmin
tasks. We would also have serious trouble finding something more
reliable, since all we could do is re-implement IPC::Open3.
I guess Tom and Nathan will have a strong interest in fixing this, because
the basis for it is published in their Cookbook (which is otherwise
excellent!) and everybody and his uncle may run into the same problem
we did.
Do you know of this already? Is this something that can be found in
an FAQ (I guess not, otherwise you wouldn't have published the basic
code for this in the Cookbook...)?
Please let us know when you plan to fix it (if at all), and I hope
you will let us know once it is fixed. BTW, we are willing to serve
as beta testers for this.
Best regards,
Herbert
PS: Please send special greetings to Tom Christiansen, whom I had the
luck to meet in person during a Perl training he held a couple of years
ago in Munich (about 1996). :) I'm the guy who worked for TSR here in
Germany, too. (I guess he won't remember, but I don't blame him for that... :) )
--
Herbert Wengatz Phone MchP: +49 (0)89 / 636 - 47677
I&S IT PS 8 Phone MchH: +49 (0)89 / 722 - 49296
Siemens AG Mobile : +49 (0)160 / 8 85 16 85
Otto Hahn Ring 6 Fax MchP: +49 (0)89 / 636 - 47586
81738 Muenchen mailto:herbert.wengatz@mchr2.siemens.de
http://www.mvn-services.com
-----------------------------------------------------------------
---
Flags:
category=core
severity=critical
---
Site configuration information for perl v5.6.1:
Configured by hwe at Thu Jun 21 10:08:59 MEST 2001.
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
Platform:
osname=linux, osvers=2.2.14, archname=i686-linux
uname='linux elrond 2.2.14 #3 mon jan 29 13:47:05 cet 2001 i686 unknown '
config_args='-de'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef
usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-fno-strict-aliasing -I/usr/local/include'
ccversion='', gccversion='2.95.2 19991024 (release)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, usemymalloc=n, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lndbm -lgdbm -ldbm -ldb -ldl -lm -lc -lposix -lcrypt -lutil
perllibs=-lnsl -ldl -lm -lc -lposix -lcrypt -lutil
libc=, so=so, useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'
Locally applied patches:
---
@INC for perl v5.6.1:
/usr/local/lib/perl5/5.6.1/i686-linux
/usr/local/lib/perl5/5.6.1
/usr/local/lib/perl5/site_perl/5.6.1/i686-linux
/usr/local/lib/perl5/site_perl/5.6.1
/usr/local/lib/perl5/site_perl
.
---
Environment for perl v5.6.1:
HOME=/home/hwe
LANG=de_DE
LANGUAGE (unset)
LC_COLLATE=POSIX
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/home/hwe/bin:/usr/local/bin:/usr/bin/mh:/opt/kde/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin
PERL_BADLANG (unset)
SHELL=/bin/bash