
[ID 20010703.002] [5.6.2-to-be] PerlIO is EVIL

From: Ilya Zakharevich
Date: July 3, 2001 00:20
Subject: [ID 20010703.002] [5.6.2-to-be] PerlIO is EVIL
Message ID: 20010703031957.A15798@math.ohio-state.edu

This is a bug report for perl from ilya@ia-ia.uucp,
generated with the help of perlbug 1.33 running under perl v5.7.1.

  #!perl -w

  open F, "$^X -wle 'binmode STDOUT; print q()' |" or die;
  my $c = getc F or warn;
  $c eq "\n" or warn ord $c;
  close F or die;

is silent on the usual perl, and warns "255" on 5.6.2-to-be.  No, I have
no idea why it is 255; I constructed this example to demonstrate a
*different* bug of PerlIO.

The *actual* problem is that char-by-char input requires pressing the
ENTER key TWICE for this key to be seen by Perl.  Debugging this problem
(via the Term::ReadKey test suite) shows the following logic:

  pp_getc() calls  is_eof() which does getc/ungetc
	    calls  getc()

  [BTW, I see no logic in this sequence of events.]

The problem is that ungetc() can't unget "\n" if this "\n" is the first char
in the buffer, and quietly drops the "\n" on the floor.  The (horrible) code
looks like this:

       if (ch == '\n')
        {
         if (b->ptr - 2 >= b->buf)
          {

Tracing through perlio.c shows how much EVIL it is.  All 80K of it.  I
simply have no words for this code.  Extremely inefficient and bloated,
doing an enormous amount of work for trivial tasks - not even mentioning
how buggy it turns out to be!  Apparently the principal task is to have
CRNL translation, and the principal usage is on Dosish systems - which
*already* do this with binary().

In the example of Term::ReadKey, CRTL *knows* that TTYs should be opened
in binary mode.  PerlIO does not.  There are myriads of such special
cases.

This code *duplicates* the work already done in the CRTL, thus leading to
double translations (as easy examples show).  Not even mentioning that
this code is probably 1/10 as efficient as the code in CRTL (and 5 times
bulkier and sub-call-deeper) - in*****tent is the only word I could find
for perlio.c.

The task of CRNL translation is very quick and easy - with a proper
implementation.  No need for extra copying, no nothing: just keep an extra
pointer into the buffer.  Whatever is before the pointer is translated,
whatever is after is not - and the char immediately after the pointer
should be ignored.  On _fill() move the pointer to the "\n" in the nearest
"\r\n" pair, and replace the "\r" of that pair by "\n".  That's all.  ungetc()
modifies only the already-translated part, so there is no need to untranslate
things (which is impossible anyway).  Similar things may be done for all
the other needed translations.

[It is also very efficient to report only the part up to this pointer as
available, if it is available.  This allows simple inlining of getc().]
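
A minimal C sketch of the extra-pointer scheme described above, not code
from perlio.c.  All names (crlf_buf, crlf_init, crlf_window, crlf_getc,
crlf_ungetc) are hypothetical.  It assumes the data already sits in one
buffer; a "\r\n" pair split across two fills is not handled, and a lone
"\r" is passed through unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer layout: [ptr, xlat) is already translated. */
typedef struct {
    char *buf;    /* start of the buffer */
    char *ptr;    /* next byte to hand to the reader */
    char *xlat;   /* end of the translated window [ptr, xlat) */
    char *end;    /* end of valid data */
    int   stale;  /* nonzero: *xlat is the leftover '\n' of a pair */
} crlf_buf;

static void crlf_init(crlf_buf *b, char *data, size_t len)
{
    b->buf = b->ptr = b->xlat = data;
    b->end = data + len;
    b->stale = 0;
}

/* Return the number of translated bytes available at ptr, opening
 * the next window if the current one is exhausted (0 at end of data). */
static size_t crlf_window(crlf_buf *b)
{
    char *p;

    assert(b->ptr <= b->xlat && b->xlat <= b->end);
    if (b->ptr < b->xlat)
        return (size_t)(b->xlat - b->ptr);

    if (b->stale) {           /* skip the ignored second byte of the pair */
        b->xlat++;
        b->stale = 0;
    }
    b->ptr = b->xlat;

    for (p = b->ptr; b->end - p >= 2; p++) {
        if (p[0] == '\r' && p[1] == '\n') {
            p[0] = '\n';      /* translate in place, no copying */
            b->xlat = p + 1;  /* pointer now rests on the old '\n' */
            b->stale = 1;
            return (size_t)(b->xlat - b->ptr);
        }
    }
    b->xlat = b->end;         /* no more pairs: the rest is plain data */
    return (size_t)(b->xlat - b->ptr);
}

/* The fast path is a pointer compare and a load, so it inlines easily. */
static int crlf_getc(crlf_buf *b)
{
    if (b->ptr == b->xlat && crlf_window(b) == 0)
        return -1;
    return (unsigned char)*b->ptr++;
}

/* Touches only bytes before ptr, which are already consumed, so '\n'
 * goes back as a single byte and nothing is untranslated. */
static int crlf_ungetc(crlf_buf *b, int ch)
{
    if (b->ptr == b->buf)
        return -1;            /* no room; report it instead of dropping */
    *--b->ptr = (char)ch;
    return (unsigned char)ch;
}
```

In this sketch the reader only ever sees bytes before the pointer, so the
length returned by crlf_window() is also the amount that can be handed
out as a plain pointer into the buffer.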

I do not even want to mention that in many cases all Perl needs is a
pointer to the buffer, but PerlIO insists on copying the stuff...

I do not think that PerlIO is ready for 5.6.2.

Hope this helps,
Ilya

---
Flags:
    category=core
    severity=critical
---
Site configuration information for perl v5.7.1:

Configured by vera at Mon Jun 25 01:33:44 PDT 2001.

Summary of my perl5 (revision 5.0 version 7 subversion 17) configuration:
  Platform:
    osname=os2, osvers=2.30, archname=os2
    uname='os2 ia-ia 2 2.30 i386 '
    config_args='-des -Dprefix=i:/perllib -Dusedevel'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='gcc', ccflags ='-Zomf -Zmt -DDOSISH -DOS2=2 -DEMBED -I. -D_EMX_CRT_REV_=63 -Wall',
    optimize='-O2 -fomit-frame-pointer -malign-loops=2 -malign-jumps=2 -malign-functions=2 -s',
    cppflags='-Zomf -Zmt -DDOSISH -DOS2=2 -DEMBED -I. -D_EMX_CRT_REV_=63 -Wall'
    ccversion='', gccversion='2.8.1', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
    alignbytes=4, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags ='-Zexe -Zomf -Zmt -Zcrtdll -Zstack 32000 -Zlinker /e:2'
    libpth=i:/emx.add/lib i:/emx/lib D:/DEVTOOLS/OPENGL/LIB I:/JAVA11/LIB i:/emx/lib/mt
    libs=-lsocket -lm -lbsd
    perllibs=-lsocket -lm -lbsd
    libc=i:/emx/lib/mt/c_import.lib, so=dll, useshrplib=true, libperl=libperl.lib
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-Zdll', lddlflags='-Zdll -Zomf -Zmt -Zcrtdll -Zlinker /e:2'

Locally applied patches:
    DEVEL10906

---
@INC for perl v5.7.1:
    lib/os2
    lib
    i:/perllib/lib/5.7.1/os2
    i:/perllib/lib/5.7.1
    i:/perllib/lib/site_perl/5.7.1/os2
    i:/perllib/lib/site_perl/5.7.1
    i:/perllib/lib/site_perl/5.00553/os2
    i:/perllib/lib/site_perl/5.00553
    i:/perllib/lib/site_perl
    .

---
Environment for perl v5.7.1:
    HOME=j:/home
    LANG=EN_US
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=i:\NETSCAPE\PROGRAM;d:\OPENDOC\EPM;d:\OPENDOC\OREXX;d:\OPENDOC\BIN;i:\DRV;i:\TOOLKIT\BETA\BIN\PMEISTER;i:\TOOLKIT\BETA\BIN;i:\TOOLKIT\BIN\OSARED;i:\TOOLKIT\SOM\COMMON;i:\TOOLKIT\SOM\BIN;i:\TOOLKIT\BIN;D:\OS2;D:\OS2\SYSTEM;i:\VABASIC\SOM\BIN;D:\OS2\INSTALL;D:\;D:\OS2\MDOS;D:\OS2\APPS;i:\BIN;i:\UTILS;i:\EMACS\19.33\BIN;i:\emx.add\BIN;i:\EMX\BIN;F:\PBM;i:\EMTEX\bin;i:\ckermit;I:\PAGETURN;d:\tcpip\bin;D:\TCPIP\UMAIL;D:\VIEWER\BIN;I:\XFree86\bin;i:\VABASIC\BIN;i:\SMART;I:\Java11\rmi-iiop\bin;I:\JAVA11\BIN;d:\JAVAOS2\bin;D:\MMOS2;I:\JAVA11\ICATJAVA\BIN;I:\gs\tools;I:\gs\gs7.00\bin;
    PERLLIB_PREFIX=f:/perllib;i:/perllib
    PERL_BADLANG (unset)
    PERL_SH_DIR=i:/bin
    SHELL (unset)

