
Re: Thread::Use - use a module inside a thread only

From: Nicholas Clark
Date: July 30, 2002 06:10
Subject: Re: Thread::Use - use a module inside a thread only
Message ID: 20020730141023.J38407@plum.flirble.org

On Tue, Jul 30, 2002 at 12:30:24PM +0200, Elizabeth Mattijsen wrote:
> At 11:22 AM 7/30/02 +0100, Nicholas Clark wrote:
> > > Hmmm... do we _really_ want to be reading or writing gzipped files from
> > > different threads?  I think that would really be asking for trouble...
> >Oh, reading or writing should "break" (in that your data gets corrupted by
> >being randomly split into the two programs, or randomly spliced together
> >on disk from the two programs) but it should not SEGV (or break your program
> >otherwise with double free()s)
> 
> Ah... ok, glad we agree on that...  ;-)

It's just that I can't see an easy way to deliver that guarantee of lack of
SEGV.

> >... providing one side of the fork doesn't
> >touch the stream further then all is happy. (all that side does is close and
> >cleanly discard its z_stream structure at some point) If both sides of the
> >fork access the stream then their zlib structures in memory are not
> >corrupted, but garbage data will result.

And I believe that the structure used in the new thread needs to do all
its malloc()ing from the new thread's malloc. So a copy of the structure
is needed (into pools from the new malloc) rather than just "stealing" the
existing malloc()ed memory from the parent thread.

> As I said, as a simple solution for now, I think a:
> 
>    sub CLONE { undef( %PerlIO::gzip:: ) }
> 
> or equivalent of that in XS code would be sufficient to get around the 
> current problem of having to start your threads before you can open a 
> PerlIO::gzip layer...

I believe that this won't work, and nor will the direct XS equivalent. It
will only hide the problem:

Once an IO layer is created, it is a free standing object attached to a
stream. It doesn't matter if you "nuke" the module in the symbol table -
you still have existing PerlIO structures pushed on existing streams that
don't get killed this way.
The PerlIO system defaults to doing a shallow structure copy (much like
the default in C++ for copying objects) to make a copy of every layer for
the new chain of layers in the layer stack. To clone safely, PerlIO::gzip
needs to implement a deep copying constructor, and zlib doesn't provide
one for inflate streams. (I've checked: deflateCopy copies the internal state
for deflation streams, and the internal state differs.)
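
To illustrate the shallow versus deep copy distinction described above, here
is a minimal C sketch. The layer_t type and the layer_* function names are
hypothetical stand-ins chosen for illustration; they are not PerlIO's or
zlib's real structures or API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layer that owns heap state (stand-in for a z_stream owner). */
typedef struct {
    unsigned char *buf;   /* owned heap buffer */
    size_t len;           /* number of bytes in buf */
} layer_t;

/* Shallow copy: a plain struct copy, so both copies share one buffer.
 * If each copy later frees buf, the result is a double free(). */
static layer_t layer_shallow_copy(const layer_t *src)
{
    return *src;
}

/* Deep copy: allocate a fresh buffer so each copy owns its own memory.
 * Returns 0 on success, -1 if allocation fails. */
static int layer_deep_copy(layer_t *dst, const layer_t *src)
{
    dst->buf = NULL;
    dst->len = 0;
    if (src->len == 0)
        return 0;
    dst->buf = malloc(src->len);
    if (dst->buf == NULL)
        return -1;
    memcpy(dst->buf, src->buf, src->len);
    dst->len = src->len;
    return 0;
}

/* Release the buffer owned by a layer and reset it. */
static void layer_free(layer_t *l)
{
    free(l->buf);
    l->buf = NULL;
    l->len = 0;
}
```

After a shallow copy only one of the two structures may safely call
layer_free(); after a deep copy both may.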

So a zero-SEGV (if annoying) solution would appear to be to track every
PerlIO::gzip layer created (per thread), and zap the lot at clone time
(for that thread). That seems quite a bit of work in itself, just to provide
an interface that lets you have PerlIO::gzip loaded once but unable to
leave layers stacked over a CLONE.
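
A rough C sketch of that "track and zap" bookkeeping might look like the
following. Everything here is an assumption made for illustration: the
gz_layer_t and layer_registry_t types, the registry_* names, and the fixed
MAX_LAYERS cap are all hypothetical, and a real version would hang the
registry off per-interpreter data rather than a plain struct.

```c
#include <stddef.h>

#define MAX_LAYERS 16   /* arbitrary cap for this sketch */

/* Hypothetical layer handle: only tracks whether it may still be used. */
typedef struct {
    int usable;   /* 1 while reads/writes are allowed, 0 once zapped */
} gz_layer_t;

/* Hypothetical per-thread list of live layers. */
typedef struct {
    gz_layer_t *layers[MAX_LAYERS];
    size_t count;
} layer_registry_t;

/* Record a newly created layer. Returns 0 on success, -1 if full. */
static int registry_track(layer_registry_t *reg, gz_layer_t *layer)
{
    if (reg->count >= MAX_LAYERS)
        return -1;
    layer->usable = 1;
    reg->layers[reg->count++] = layer;
    return 0;
}

/* At clone time: mark every tracked layer unusable and empty the list.
 * Returns the number of layers that were zapped. */
static size_t registry_zap_all(layer_registry_t *reg)
{
    size_t zapped = reg->count;
    for (size_t i = 0; i < reg->count; i++) {
        reg->layers[i]->usable = 0;
        reg->layers[i] = NULL;
    }
    reg->count = 0;
    return zapped;
}
```

Every read or write path would then have to check usable first and fail
cleanly, which is part of why this is more work than it first sounds.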

Nicholas Clark


