develooper Front page | perl.perl5.porters | Postings from May 2003

Re: Meaning of sysread()

From: Mark Mielke
Date: May 21, 2003 08:36
Subject: Re: Meaning of sysread()
Message ID: 20030521154223.GA25674@mark.mielke.cc
On Wed, May 21, 2003 at 03:45:39PM +0100, Nick Ing-Simmons wrote:
> Mark Mielke <mark@mark.mielke.cc> writes:
> >I expect Perl sysread() to call C read() without any buffering, which
> >therefore means the level at which the system call returns
> >units. Specifically I don't want select() to ever block where
> >sysread() would return data.
> That way round is reasonably likely to work.
> What the characters semantic might cause is the converse: select()
> says it is readable but sysread() blocks. For example we only have 
> the (part-of?) escape-sequence that says "following characters are ASCII" 
> but no characters yet - and we don't want to return 0 because that means EOF 
> and it isn't.

Here is another argument:

sysread() is documented ('man perlfunc') to '[attempt] to read LENGTH
characters of data ... [bypassing] buffered IO ...'. If Perl sysread()
is actually implemented using C read(), then it would be impossible to
read LENGTH characters of data, and only LENGTH characters of data,
without calling C read() once for each byte. Remember: no buffering,
so you can't put the bytes back. Therefore, either the documentation is
wrong or sysread() is implemented extremely inefficiently.
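
To make the argument above concrete, here is a minimal sketch in Python
rather than Perl. Python's os.read() is a thin wrapper around the read()
system call, so it stands in for C read() here. The helper names
(pipe_with, read_chars_unbuffered) are made up for illustration and say
nothing about how Perl actually implements sysread(). The sketch assumes
UTF-8 data and shows two things: a byte-limited read can stop in the
middle of a character, and one way to read exactly N characters with no
read-ahead is to ask the kernel for one byte at a time.

```python
import codecs
import os


def pipe_with(data):
    """Return the read end of a pipe pre-loaded with data (writer closed)."""
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)
    return r


def read_chars_unbuffered(fd, nchars, encoding="utf-8"):
    """Read exactly nchars characters from fd without any read-ahead.

    Each os.read() asks for a single byte, so no byte beyond the last
    requested character is ever consumed. Returns (text, read_calls).
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    text = ""
    calls = 0
    while len(text) < nchars:
        byte = os.read(fd, 1)
        calls += 1
        if not byte:  # end of file before nchars characters arrived
            break
        text += decoder.decode(byte)  # yields "" until a character completes
    return text, calls


if __name__ == "__main__":
    # "a", e-acute, euro sign, "z": 1 + 2 + 3 + 1 = 7 bytes, 4 characters.
    data = "a\u00e9\u20acz".encode("utf-8")

    fd = pipe_with(data)
    print(os.read(fd, 2))  # a 2-byte read ends inside the second character
    os.close(fd)

    fd = pipe_with(data)
    text, calls = read_chars_unbuffered(fd, 3)
    print(repr(text), calls)  # three characters, one system call per byte
    print(os.read(fd, 16))    # the unconsumed remainder is still in the pipe
    os.close(fd)
```

The byte-at-a-time loop is the cost being objected to: the number of
system calls grows with the number of bytes, not with the number of
sysread() calls the program makes.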

Since it is doubtful that people would accept inefficient behaviour on
the part of sysread(), the only valid option seems to be to correct
the documentation. The documentation should read:

    Attempts to read LENGTH _bytes_ of data ...

Bytes are the only quantifiable unit that can be passed straight
into the kernel.

To support legacy applications, a successful sysread(4096) that reads
all 4096 bytes should return 4096, which means that the return value
should also be in bytes (the same as C read()). Again to support
legacy applications, a successful sysread(4096) should insert 4096
new units into SCALAR, meaning that SCALAR should hold bytes.
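
As an illustration of the byte-counting semantics proposed above, here
is a small sketch, again in Python with os.read() standing in for C
read(). The helper name read_once is hypothetical. It assumes a pipe
that already holds 4096 bytes of UTF-8 text made of two-byte
characters, so the byte count and the character count differ.

```python
import os


def read_once(fd, length):
    """One unbuffered read; both the count and the data are in bytes."""
    data = os.read(fd, length)
    return len(data), data


if __name__ == "__main__":
    r, w = os.pipe()
    # 2048 two-byte characters: 4096 bytes on the wire.
    os.write(w, ("\u00e9" * 2048).encode("utf-8"))
    os.close(w)

    count, data = read_once(r, 4096)
    os.close(r)

    # The return value matches the request in bytes; the number of
    # characters only becomes known after a separate decoding step.
    print(count, len(data.decode("utf-8")))
```

A caller that asked for 4096 and got 4096 back can tell the read was
complete. If the count were in characters, the same read would report
2048 and look like a short read.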

For Perl read(), we don't care, because read() is buffered and is not
affected by the same performance issues mentioned earlier. Reading one
byte at a time is not expensive for buffered PerlIO handles.
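
The buffering point can be sketched the same way. This uses Python's
io.BufferedReader as a stand-in for a buffered PerlIO handle, which is
an analogy and not a description of PerlIO internals. CountingRaw and
count_raw_reads are hypothetical names; the raw layer simply counts how
often the buffered layer asks it for data.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts the reads issued against it."""

    def __init__(self, data):
        super().__init__()
        self._data = data
        self._pos = 0
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Each call here corresponds to one read() on a real descriptor.
        self.calls += 1
        chunk = self._data[self._pos:self._pos + len(b)]
        b[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


def count_raw_reads(total_bytes, buffer_size, single_byte_reads):
    """Read one byte at a time through a buffer; return raw read count."""
    raw = CountingRaw(b"x" * total_bytes)
    buffered = io.BufferedReader(raw, buffer_size=buffer_size)
    for _ in range(single_byte_reads):
        buffered.read(1)
    return raw.calls


if __name__ == "__main__":
    # 1000 one-byte reads through a 4096-byte buffer.
    print(count_raw_reads(8192, 4096, 1000))
```

The 1000 one-byte reads are served from a buffer that was filled in
bulk, so the raw layer is consulted far fewer than 1000 times. The
unbuffered path has no such cushion.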

To validate this entire line of thinking I suggest that the following
point be seriously considered:

   - sysread() is supposed to be a more direct system read that avoids
     the intermediate layers of processing. This definitely includes STDIO,
     and as far as I am concerned, it definitely includes filtering.

> Devious work-rounds like failing with EAGAIN might work round this...

*shudder*

Cheers,
mark

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/

