perl.perl5.porters | Postings from November 2014

Re: [perl #122199] perl5.18 segfault after wait() when system don't have much free memory

From: Mark Martinec
Date: November 3, 2014 23:30
Subject: Re: [perl #122199] perl5.18 segfault after wait() when system don't have much free memory
Message ID: ac5288738b7dc07b05c561bc4f9c9b8c@mailbox.ijs.si

> On Sat, Oct 18, 2014 at 03:39:17AM +1100, L.O.Sim wrote:
>> In the past, my application will run for 4 or more hours before core
>> dumps or funny perl internal errors start to happen. Then I managed to
>> reduce it to a small script that high-lights the problem.

2014-10-20 12:38, Dave Mitchell via RT wrote:
> Can you (or someone else with the ability to reproduce this) please run
> this on a perl (the newest version you have access to), built with
> debugging symbols (and preferably -DDEBUGGING), then provide a C-level
> backtrace? Also, provide the 'perl -V' output for the perl you use.
> 
> Also if possible, run it under valgrind, or clang with Address Sanitizer,
> and show what errors it reports.

Thanks, Dave, for your interest and concern in this matter.
Unfortunately it seems to be quite hard to reproduce the failure at will.
I have been trying the sample code provided by L.O.Sim here, along with
some variations (including one using Net::Server::PreFork, which is how
my application handles its child processes), yet I can't make it fail
here, on the same hardware, compiler and perl as my production setup,
which crashes every once in a while, e.g. once every one to three days.

Meanwhile our mail filtering application had one particularly unfortunate
incident which (of course) started in the middle of the night and
continued until Saturday noon. Hence I took the advice of L.O.Sim (to
avoid having a parent process regularly spawn new child processes) and
configured the child processes never to retire (i.e. never to exit
voluntarily and be replaced by a new child process). So instead of
having 50 child processes, each retiring after processing 20 mail
messages, I now have 50 child processes which never retire and handle
thousands of requests each. After several days (150,000 messages
processed by the same set of 50 processes), the setup is still alive and
well, with no crashes. It is still running under perl 5.20.1 built with
-DDEBUGGING, with perl and modules compiled with
gcc -fstack-protector-all.

This only confirms that the problem does not originate from XS code
in some module, but is directly related to the fork/exit/wait sequence,
eventually leading to corruption in the parent process.
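
For illustration, the fork/exit/wait sequence referred to above can be
sketched roughly as follows. This is a minimal example in core Perl only,
not L.O.Sim's test script and not the production code; the sub name
churn_children and the round and child counts are made up for the sketch,
and it is not claimed to reproduce the crash.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): spawn $rounds batches of
# $nchildren short-lived children. Each child exits right away and the
# parent reaps it with wait(). Returns the number of children that were
# reaped with a zero exit status.
sub churn_children {
    my ($rounds, $nchildren) = @_;
    my $reaped_ok = 0;

    for my $round (1 .. $rounds) {
        my %pending;

        for (1 .. $nchildren) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;

            if ($pid == 0) {
                # Child: a real worker would handle its quota of
                # requests here before retiring.
                exit 0;
            }
            $pending{$pid} = 1;    # parent remembers the child
        }

        # Parent: reap every child of this batch before starting the next.
        while (%pending) {
            my $pid = wait();
            last if $pid == -1;                   # no children left
            next unless delete $pending{$pid};    # not one of ours
            $reaped_ok++ if $? == 0;
        }
    }
    return $reaped_ok;
}

printf "reaped %d children\n", churn_children(3, 5);
```

Configuring the workers never to retire, as described above, amounts to
the parent running the inner fork loop only once at startup, so that the
wait() path is hardly ever exercised in the parent.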

The few experiments that I have run under valgrind neither resulted
in a crash nor detected any notable problem. Unfortunately running
under valgrind is too slow for regular mail processing. I don't think
our version of clang (3.3) supports Address Sanitizer yet; I need to
take another look. Still, the MALLOC checks and gcc
-fstack-protector-all should have covered certain types of memory
overruns, yet none was detected. I'm beginning to suspect virtual
memory handling under FreeBSD 10.0, although if there were a problem
there, some other application would likely have stumbled across it too,
and it would have been reported in the last six months.

   Mark


