develooper Front page | perl.perl5.porters | Postings from March 2017

Re: [perl #32714] Objects destroyed in the wrong order during global destruction

From:
demerphq
Date:
March 14, 2017 12:52
Subject:
Re: [perl #32714] Objects destroyed in the wrong order during global destruction
Message ID:
CANgJU+UMLh9zniQDObhNpHjXD8TSK+51sWQrcGb-VNfRnZndtA@mail.gmail.com
On 14 March 2017 at 13:40, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Tue, Mar 14, 2017 at 1:22 PM, demerphq <demerphq@gmail.com> wrote:
>> On 11 March 2017 at 22:16, Leon Timmermans <fawaka@gmail.com> wrote:
>>> On Sat, Mar 11, 2017 at 1:32 PM, Ævar Arnfjörð Bjarmason
>>> <avarab@gmail.com> wrote:
>>>> On a related note. At Booking.com in our web applications we don't run
>>>> destruction beyond just destroying the Plack $env, then we
>>>> POSIX::_exit(). See https://github.com/unbit/uwsgi/pull/1392/files &
>>>> https://github.com/unbit/uwsgi/issues/1384 for why, but tl;dr: With
>>>> forked processes it creates a CPU/Memory DoS attack on the kernel as
>>>> it frantically tries to unshare all the pages whose refcounts change.
>>>>
>>>> So uWSGI has an option now to simply skip the perl_destruct() &
>>>> perl_free() phases, which from my looking at the source & testing it
>>>> seemed like the sane solution:
>>>> https://github.com/avar/uwsgi/blob/bafc14c80771a2c55063c3c9cc9c9dd0377a0294/plugins/psgi/psgi_plugin.c#L860
>>>>
>>>> I can't think of a use-case for this, but is it a good idea to provide
>>>> some alternative to POSIX::_exit() in this case, i.e. is there some
>>>> subset of perl_{destruct,free}() that we could run that wouldn't free
>>>> everything, but e.g. just enough to run any DESTROY handlers?
>>>> Currently if we have some DESTROY handler hanging off anything but
>>>> $env it simply won't execute when we teardown the process.
>>>>
>>>> I don't know if that's possible or even a mad idea, but since you're
>>>> poking this code I thought I'd bring this use-case to your attention.
>>>
>>> Perl already has a concept of destruct level, current valid values are
>>> 0, 1, and 2.
>>>
>>> 0 destructs much of the interpreter and all objects, this is the
>>> default on unthreaded perls.
>>> 1 destructs a bit more of the interpreter, in particular the stashes.
>>> This is the default in an embedded perl with multiplicity enabled.
>>> 2 destructs absolutely everything, including non-object cycles and the
>>> global string table. threads.pm sets it to this value.
>>>
>>> I'm not sure what the value for PL_perl_destruct_level is in your
>>> particular case, lowering it may help. I can imagine introducing a -1
>>> level that does little more than run END blocks and flush IO buffers.
>>
>> Those models don't match the problem Booking is solving. They may match
>> a different problem, but that is a different question.
>>
>> The issue here is that in a multi-forked process model you want to add
>> an extra concept: which process "owns" the data, and is thus
>> responsible for the cleanup of the objects involved.
>>
>> So consider a scenario where we have a mother process which owns an
>> object, let's say a string. Now we fork, and have the child process do
>> something, say write some summaries to disk. When that child process
>> terminates we don't want it to try to free or destruct any
>> variables owned by the parent process that the child process hasn't
>> touched, in this case the string structure owned by the parent.
>>
>> What happens in the normal scenario is that during cleanup the child
>> decrements the refcount of that string, and that write to the page
>> holding it un-COWs the page.
>>
>> If you are using a fork/exec model, the costs of tearing down all the
>> structures which are effectively RO to the child process can be very
>> significant, to the point of being a "forkbomb" in disguise.
>>
>> So we don't tear down, we just kill everything. Avar would like a more
>> graceful solution that knows that there are some things the child
>> should tear down, and that the rest should be left alone.
>>
>> Examples of this abound btw, for instance consider PL_strtab. Every
>> time a child process exits we free up all those keys....
>
> Everything you said makes sense & is a good summary of our situation
> and why we just short-circuit destruction via POSIX::_exit().
>
> But just to clarify: I'm not particularly looking for a graceful
> solution like this myself or for us, I just thought it was interesting
> to point out this use-case in general if we're talking about tweaks to
> how perl does destruction when embedded.

Ok, fair enough. On the other hand, if someone could offer us a
graceful solution to this we would take it and stop doing the
hard-hammer approach we do now...

> I.e. for Booking's web-serving processes we're perfectly happy to just
> not run perl_{destruct,free} on teardown, and just say that you can't
> have a DESTROY handler hanging off anything but the $env variable.
> I.e. we won't run any package-level DESTROY handlers, which is a
> trade-off we're OK with, and we're about to tear down the process
> anyway so it's fine to leave the PerlInterpreter in a dirty state.
>
> But this is an interesting question for perl embedding in general.
> E.g. without multiplicity or multithreading you might have a 10GB perl
> interpreter that you free only to the extent needed to spawn another
> one, i.e. intentionally leak memory because you're going to make the
> OS take care of the cleanup eventually.
>
> Or perhaps out of those 10GB of variables/hashes/data you'd just like
> the package-level DESTROY handlers to fire, which might free and un-COW
> 500MB of those 10GB, but would cost you way less than full
> destruction.
>
> Another thing I didn't have time to look into is to what extent the
> memory layout could be improved to avoid these DoS situations, i.e.
> what specifically is causing the pages to COW. I assume it's because
> we go around touching the REFCNT, which (I think) is stored in the
> same pages as the rest of the SV header. Perhaps changing some of
> that around and storing just the REFCNT in its own pages, distinct
> from the SVs', would be a good tradeoff in some cases.

The SV header holds the refcount, and SV heads are packed into pages. An
SV head holds more than just the refcount, however, so we don't get the
maximum density possible, but I think it's pretty dense as is.

Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


