[perl #122199] perl5.18 segfault after wait() when system don't have much free memory

From: Mark Martinec via RT
Date: January 26, 2015 17:06
Subject: [perl #122199] perl5.18 segfault after wait() when system don't have much free memory
Message ID: rt-4.0.18-20834-1422291998-29.122199-15-0@perl.org

After upgrading to FreeBSD 10.1 (from 10.0), I have been running the same
application with the same version of Perl for two months now. The master
process periodically retires child processes and spawns new ones, as it did
previously under FreeBSD 9.x, and I can now confirm that the problem no
longer occurs.

I can also confirm that the problem under 10.0 can be avoided by
not letting child processes exit voluntarily, so that the master process
never sees a child termination in wait() and never needs to spawn (fork)
another child process.

A brief summary of the problem:

Setup: an application consisting of a master perl process spawning worker
child processes, which periodically voluntarily self-terminate, to be
replaced by a fresh child process forked from the master process.
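
For readers unfamiliar with this pre-fork model, the setup can be sketched roughly as follows. This is only an illustration in Python, not the application's actual Perl code: the names worker, spawn and run_master, the job count and the sleep standing in for real work are all hypothetical.

```python
import os
import time


def worker(max_jobs):
    """Hypothetical worker: handle a fixed number of jobs, then retire."""
    try:
        for _ in range(max_jobs):
            time.sleep(0.01)  # stand-in for real work
    finally:
        os._exit(0)  # voluntary exit; the master sees it in wait()


def spawn(max_jobs):
    """Fork one worker from the master and return its pid."""
    pid = os.fork()
    if pid == 0:
        worker(max_jobs)  # child: never returns
    return pid


def run_master(num_workers=2, total_spawns=5, max_jobs=3):
    """Keep workers running until total_spawns children have been forked."""
    live = {spawn(max_jobs) for _ in range(num_workers)}
    spawned = num_workers
    reaped = []
    while live:
        pid, status = os.wait()  # blocks until some child terminates
        live.discard(pid)
        reaped.append((pid, os.WEXITSTATUS(status)))
        if spawned < total_spawns:
            live.add(spawn(max_jobs))  # replacement forked from the master
            spawned += 1
    return reaped
```

Every replacement worker is forked from the same long-lived master, so each one starts from a copy of whatever the master's memory contains at that moment.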

Environment:
- occurs only on FreeBSD 10.0 amd64, any recent version of perl, gcc or clang.
- does not occur on FreeBSD 9.x or 10.1, nor on i386; not reproducible
  on Linux

What seems to be happening:
- a child process, after doing some work (possibly touching swap),
  does a normal exit;
- the parent process gets a SIGCHLD signal, handles it with wait(), and
  for some obscure reason some of its memory gets corrupted;
- the parent process forks, creating a new worker child process,
  which inherits the corrupted sections of the parent's memory,
  later leading to a crash of the child if it happens
  to use that part of the memory (opcodes or data structures)
  during its normal work. Every child process forked afterwards inherits
  the same memory corruption and crashes in the same way.
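
The last point, that every later child starts out with the parent's damaged state, can be illustrated with a small sketch. It does not reproduce the FreeBSD problem: the "corruption" here is a deliberate overwrite in the parent, and child_view and demo are hypothetical helper names used only for this illustration.

```python
import os


def child_view(state):
    """Fork a child and return the bytes that child sees in `state`."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.write(write_fd, bytes(state))  # report the inherited copy
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        seen = pipe.read()
    os.waitpid(pid, 0)  # reap the child
    return seen


def demo():
    state = bytearray(b"good")
    before = child_view(state)  # child forked while the parent is intact
    state[0:4] = b"bad!"        # simulated damage, in the parent only
    after = [child_view(state) for _ in range(3)]
    return before, after
```

Under these assumptions, the child forked before the overwrite sees the intact bytes, while every child forked afterwards sees the damaged ones, since fork gives each child a copy of the parent's memory as it is at the time of the call.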

So it seems the problem is somehow connected with how FreeBSD 10.0
on amd64 manages virtual memory (fork, exit, wait, possibly
involving swap). The problem is apparently fixed in 10.1, and
not present in 9.x. Does anybody have a sound explanation?

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=122199


