Jon Christensen wrote:
> I have been using perl for years now and have always been a
> supporter. However, a recent experience with PERL 5.8.0 on RedHat
> LINUX 3.0AS has left me speechless. The documentation (and I
> have read a lot of it) for the read() command explicitly states
> that read reads length bytes from a filehandle. So to read one
> byte you would:
>
> read(FILEHANDLE,$buf,1)
>
> However, in some cases this will actually read multiple bytes.
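You're probably using a UTF-8 locale, which turns on UTF-8 handling
on input by default in Perl 5.8.0. With that enabled, read() counts
characters rather than bytes, and a single character can span
several bytes.
More recent versions of perlfunc state this for read():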
    Note the characters: depending on the status of the filehandle,
    either (8-bit) bytes or characters are read. By default all
    filehandles operate on bytes, but for example if the filehandle
    has been opened with the ":utf8" I/O layer (see "open", and the
    "open" pragma, open), the I/O will operate on UTF-8 encoded
    Unicode characters, not bytes. Similarly for the ":encoding"
    pragma: in that case pretty much any characters can be read.
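
For what it's worth, here is a minimal sketch of the difference, assuming a Perl with PerlIO layers (5.8 or later). The sample data, the temporary file and the helper name read_one() are made up for illustration. The ":utf8" layer is requested explicitly here, whereas in your case the locale would be switching it on implicitly. The last call shows binmode() putting the handle back into byte mode before reading.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: U+00E9 encoded as two UTF-8 bytes, then "A".
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} "\xC3\xA9A";
close($out) or die "close failed: $!";

# Open $path with the given mode, optionally call binmode() on the
# handle, then ask read() for a length of 1. Returns the ordinal of
# what was read and the length of whatever is left in the file.
sub read_one {
    my ($mode, $force_bytes) = @_;
    open(my $fh, $mode, $path) or die "open failed: $!";
    binmode($fh) if $force_bytes;    # drop back to plain bytes
    my $got = read($fh, my $buf, 1);
    die "read failed" unless $got;
    my $rest = do { local $/; <$fh> };
    close($fh);
    return (ord($buf), length($rest));
}

my @raw   = read_one('<:raw',  0);
my @utf8  = read_one('<:utf8', 0);
my @fixed = read_one('<:utf8', 1);

printf "raw:             read 0x%X, %d left\n", @raw;
printf "utf8:            read 0x%X, %d left\n", @utf8;
printf "utf8 + binmode:  read 0x%X, %d left\n", @fixed;
```

On the raw handle, read() should return the single byte 0xC3 and leave two bytes behind. On the ":utf8" handle it should return the character 0xE9, which consumed two bytes, and leave only "A". After binmode() the handle behaves like the raw one again.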