
[perl #118059] race condition+fail in dist\IO\t\cachepropagate-tcp.t

From: Tony Cook via RT
Date: August 29, 2013 01:09
Subject: [perl #118059] race condition+fail in dist\IO\t\cachepropagate-tcp.t
Message ID: rt-3.6.HEAD-1873-1377738572-133.118059-15-0@perl.org

On Tue Aug 27 16:41:11 2013, tonyc wrote:
> On Thu Jun 27 23:50:22 2013, tonyc wrote:
> > It smoked ok, but I don't trust that it fixes the underlying problem
> > (which may not be fixable, leaving us with the work-around.)
> > 
> > My theory above, as written, is hopefully nonsense - Windows returning
> > a socket and suddenly making it not-a-socket, rather than return an
> > end-of-file or EPIPE on the next operation would be even more broken
> > than I expect from Microsoft.
> > 
> > For a Real™ fork() any open file handles or sockets are cloned in the
> > child - the child can exit or explicitly close their socket handle, but
> > it won't have an effect on the socket handle the parent has.
> > 
> > Under Win32 we emulate that, which I suspect is the real cause of the
> > problem here - when a thread is created fp_dup() calls
> > PerlIO_fdupopen() which does reasonable things on Unix, but on Win32
> > that leaves all the work to win32_fdupopen().
> > 
> > win32_fdupopen() calls win32_dup(), a trivial wrapper around dup() -
> > and I don't see how that works for a socket fd unless the CRT dup() is
> > tolerant of errors from DuplicateHandle().
> 
> This theory turned out to be nonsense; I'm exploring the behaviour some
> more.
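
For reference, the real fork() behaviour described in the quoted text can be
sketched on a POSIX system roughly as below. This is only an illustration,
not code from perl: the helper name parent_socket_survives_child_close() is
made up, and a socketpair() stands in for the TCP sockets the test uses.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the Perl source): returns 1 if the parent's
 * sockets still work after a forked child closed its own copies, 0 if they
 * do not, and -1 if the setup itself failed. */
int parent_socket_survives_child_close(void)
{
    int sv[2];
    int status;
    int ok;
    pid_t pid;
    const char msg[] = "ping";
    char buf[sizeof msg] = {0};

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: close its cloned descriptors and exit. */
        close(sv[0]);
        close(sv[1]);
        _exit(0);
    }

    /* Parent: wait until the child has definitely closed its copies. */
    if (waitpid(pid, &status, 0) != pid) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    /* The parent's descriptors should be unaffected by the child's close(). */
    ok = write(sv[0], msg, sizeof msg) == (ssize_t)sizeof msg
      && read(sv[1], buf, sizeof buf) == (ssize_t)sizeof msg
      && memcmp(msg, buf, sizeof msg) == 0;

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

On a real fork() the helper should return 1; the Win32 emulation has to
reproduce that isolation by other means.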

Amongst many other things I tried, I changed win32_accept to:

SOCKET
win32_accept(SOCKET s, struct sockaddr *addr, int *addrlen)
{
    SOCKET r, s2;
    SOCKET x;

    SOCKET_TEST((r = accept(TO_SOCKET(s), addr, addrlen)), INVALID_SOCKET);
    if (r == INVALID_SOCKET) {
        dTHX;
        /* debug: report a failed accept() */
        PerlIO_printf(PerlIO_stderr(), "accept(%d (%p)) => %d failed %d\n",
                      (int)s, _get_osfhandle(s), (int)r, errno);
    }
    s2 = OPEN_SOCKET(r);
    /* debug: check that the new fd still maps back to the accepted handle */
    x = _get_osfhandle(s2);
    if (x != r) {
        dTHX;
        PerlIO_printf(PerlIO_stderr(),
                      "accept(%d (%d)) => %d but osfhandle returned bad %d\n",
                      (int)s, _get_osfhandle(s), (int)r, (int)x);
    }
    return s2;
}

The output from a failed run looked like:

1..8
ok 1 - socket created
ok 2 - protocol defined
ok 3 - domain defined
ok 4 - type defined
ok 5 - spawned a child
accept(3 (112)) => 236 but osfhandle returned bad -1
Can't use an undefined value as a symbol reference at ../dist/IO/t/cachepropagate-tcp.t line 49.
# Looks like you planned 8 tests but ran 5.
# Looks like your test exited with 9 just after 5.

So it seems some race is closing the accept()ed socket before we get to
use it.
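
The check added to win32_accept() above amounts to asking whether a freshly
created descriptor still maps to an open handle. A rough POSIX analogue, not
the Windows code and not a reproduction of the race, might look like the
sketch below. The helper names fd_is_open() and demo_closed_fd_is_detected()
are invented for illustration, a pipe stands in for the accepted socket, and
an explicit close() stands in for whatever is closing it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd refers to an open descriptor,
 * 0 if the kernel reports EBADF for it. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Shows the check flipping from "open" to "closed" once something else
 * has closed the descriptor. */
void demo_closed_fd_is_detected(void)
{
    int p[2];

    if (pipe(p) != 0)
        return;                 /* could not set up the demo */

    assert(fd_is_open(p[0]));   /* valid right after creation */
    close(p[0]);                /* stand-in for the unexpected close */
    assert(!fd_is_open(p[0]));  /* the same number is now a bad descriptor */

    close(p[1]);
}
```

A check like this only reports that the descriptor has gone; it says nothing
about which thread or call closed it.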

Pretty much any output I do in the child or parent before the accept()
prevents the problem from occurring for me, which makes it difficult to test.

Tony

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=118059
