
Re: Meaning of sysread()

From: perl5-porters
Date: May 25, 2003 10:10
Subject: Re: Meaning of sysread()
Message ID: baqtem$6c4$1@post.home.lunix

In article <20030525001049.GB20923@verizon.net>,
	Kurt Starsinic <kstar@cpan.org> writes:
> 
>     Just once more, I'd like to assert that sysread() should do
> whatever C's read() does, however weird or broken that may or may
> not be on any given platform.  Its only usefulness that I can see
> is to "do a read that works the same as in my C program".
> 

I disagree actually. I think the primary raison d'etre for sysread()
is to write socket programs (or a tty or any other stream without
guaranteed data availability). If you just sysread from such a stream
to get some data when it becomes available, there is no real problem
with any of the interpretations. But a very important use is with
select/poll, which means that once select says a handle is readable,
you want sysread to give you one of:
  1) an error
  2) a zero length read indicating EOF
  3) >=1 which means actual data.
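
A minimal sketch of those three outcomes in Perl, for illustration only.
The helper name read_ready is made up here, and a pipe stands in for a
socket. Note that Perl's sysread reports an error as undef with $! set,
where C's read() returns -1.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: classify one sysread() result from a handle that
# select() has reported as readable.
sub read_ready {
    my ($fh) = @_;
    my $n = sysread($fh, my $buf, 4096);
    return ('error', "$!") unless defined $n;   # 1) an error (undef, $! set)
    return ('eof', '')     if $n == 0;          # 2) zero-length read: EOF
    return ('data', $buf);                      # 3) >= 1 bytes of actual data
}

# Demo on a pipe, standing in for a socket.
pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "hello");
for my $fh (IO::Select->new($reader)->can_read(1)) {
    my ($kind, $payload) = read_ready($fh);
    print "$kind: $payload\n";
}
```

The point is that the caller never has to guess: every return from
sysread after a readable event falls into exactly one of the three cases.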

Now suppose select() triggered and only a single CR is available. 
What should sysread do?
 - return 0, which will be misinterpreted as EOF -> unacceptable
 - return 1, which delivers a lone CR while an LF may still be on
   its way -> unacceptable
 - block until the situation clears -> unacceptable in a
   select/poll program
 - return -1 with errno EAGAIN, squirrel the CR away in a buffer,
   and fold it into the next sysread -> ugly but acceptable
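
The last option could look roughly like the sketch below. This is only
an illustration of the idea, not how perlio does or would do it. The
names crlf_sysread and %held are hypothetical, and unlike the real
sysread the wrapper returns the translated data itself (undef on error,
'' on EOF) rather than a byte count.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);

my %held;   # hypothetical per-handle store for a held-back CR

# Read up to $len bytes, translate CRLF to LF, and never hand the caller
# a trailing CR whose LF might still be in flight.
sub crlf_sysread {
    my ($fh, $len) = @_;
    my $n = sysread($fh, my $chunk, $len);
    return undef unless defined $n;            # real error, $! already set
    my $data = (delete($held{fileno $fh}) // '') . $chunk;
    return $data if $n == 0;                   # EOF: flush a held CR, if any
    if ($data =~ s/\r\z//) {                   # trailing CR: hold it back
        $held{fileno $fh} = "\r";
    }
    $data =~ s/\r\n/\n/g;                      # complete pairs become LF
    if ($data eq '') {                         # only the lone CR arrived
        $! = EAGAIN;                           # "nothing yet, try again"
        return undef;
    }
    return $data;
}
```

A select loop that already treats EAGAIN as "go back to select" needs no
changes to cope with this; the held CR simply shows up with the next read.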

The same sort of argument of course goes for UTF8 on any OS.
Something must be decided for a multi-byte character of which only
some of the bytes have arrived so far.
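
For the UTF8 case, one way to hold back an incomplete character in user
code is the Encode module's FB_QUIET check, which stops decoding at the
first sequence it cannot handle and leaves the unprocessed bytes in the
source buffer. The sketch below assumes that behaviour; the helper name
decode_available is made up for illustration. One known limitation: it
cannot tell a truncated character from genuinely malformed input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: append freshly read bytes to $$pending, return the
# characters that are complete, and leave any unfinished tail in $$pending.
sub decode_available {
    my ($pending, $bytes) = @_;
    $$pending .= $bytes;
    # With FB_QUIET, decode() returns what it could process and rewrites
    # its input argument to contain only the unprocessed remainder.
    return decode('UTF-8', $$pending, Encode::FB_QUIET);
}

# Usage: feed it whatever each sysread delivered.
my $pending = '';
my $text = decode_available(\$pending, "abc");
```

Whether an empty result should then be reported to the caller as EAGAIN
is exactly the design question raised above.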

But this style of socket programming is just too important and ought
to work out of the box.

So either:
  - just define sysread() to have byte semantics
  - in perlio (or using libc) play the errno=EAGAIN game for CRLF
    and UTF8 (but keep byte semantics available for binmode handles)

If either of these actually describes what libc read() does, fine. But
I think it's more important to have the above semantics than a guarantee
that perl sysread is libc read (if they differ, the raw libc read could
be exposed through the POSIX module).

(I don't actually care which of these styles is chosen as long as 
the default socket sysread() has byte semantics (for backward 
compatibility on unix systems))
