Front page | perl.perl5.porters |
Postings from July 2001
Re: [ID 20010703.002] [5.6.2-to-be] PerlIO is EVIL
Thread Previous
|
Thread Next
From:
Nick Ing-Simmons
Date:
July 6, 2001 05:39
Subject:
Re: [ID 20010703.002] [5.6.2-to-be] PerlIO is EVIL
Message ID:
20010706132827.1279.4@bactrian.ni-s.u-net.com
Ilya Zakharevich <ilya@math.ohio-state.edu> writes:
>This is a bug report for perl from ilya@ia-ia.uucp,
>generated with the help of perlbug 1.33 running under perl v5.7.1.
>
> #!perl -w
>
> open F, "$^X -wle 'binmode STDOUT; print q()' |" or die;
> my $c = getc F or warn;
> $c eq "\n" or warn ord $c;
> close F or die;
>
>is silent on usual perl, and warns "255" on 5.6.2-to-be. No, I have
>no idea why it is 255, I constructd this example to demonstrate a
>*different* bug of PerlIO.
>
>The *actual* problem is that char-by-char input requires DUPLICATE pressing
>of ENTER key for this key to be seen by Perl. Debugging this problem
>(via Term::ReadKey test suite) shows the following logic:
>
> pp_getc() calls is_eof() which does getc/ungetc
> calls getc()
>
> [BTW, I see no logic in this sequence of events.]
It is legacy scheme - for systems where we cannot snoop the buffer
we ask (e.g.) stdio to get a char, if it succeeds we are not at EOF.
But then we need to put char back.
>
>The problem is that ungetc() can't unget "\n" if this \n is the first char
>in the buffer, and quietly drops "\n" to the floor. The (horrible) code
>looks like this:
>
> if (ch == '\n')
> {
> if (b->ptr - 2 >= b->buf)
> {
>
>Tracing through perlio.c shows how much EVIL it is. All 80K of it.
If the handle is in text mode then '\n' should have come from a CRLF pair,
so there should be room for it.
>I
>simply have no words for this code. Extremely inefficient and bloated,
>doing enormous amount of work for trivial tasks - not even mentioning
>how buggy it turns out to be! Apparently the principal task is to have
>CRNL translation, and the principal usage is on Dosish systems - which
>*already* do this with binary().
All the handles should be binary() at the OS level. The reason for crlf
layer is that various CRLF->'\n' logic in C runtime on Win32 is
even worse than what I have in perlio.c.
If OS2's CRLF logic is solid an efficent then it may make sense to
NOT use crlf layer but just use OS2's read/write directly.
>
>In the example of Term::ReadKey, CRTL *knows* that TTYs should be opened
>in the binary mode. PerlIO does not.
So get Term::ReadKey to tell PerlIO it should be binary.
>There are miriads of such special
>cases.
>
>This code *duplicates* the work already done in the CRTL, thus lead to
>double translations (as easy examples show).
That is NOT the intent. The intent is that crlf layer is used when:
A. The OS CRTL does not do that (e.g. reading DOSISH files on UNIX).
B. When the CRTL is buggy and inefficient (e.g. Borland and VC++ C on Win32).
>Not even mentioning that
>this code is probably 1/10 as efficient as the code in CRTL (and 5 times
>bulkier and sub-call-deeper) - in*****tent is the only word I could find
>for perlio.c.
>
>The task of CRNL translation is very quick and easy - with a proper
>implementation.
The main pain is getting tell() and seek() to work right.
>No need to extra copying, no nothing: just keep an extra
>pointer into the buffer. Whatever is before the pointer is translated,
>whatever is after is not - and the char immediately after the pointer
>should be ignored. On _fill() move the pointer to "\n" in the nearest
>"\r\n" pair, and replace the first "\r" by "\n". That's all. ungetc()
>modifies only the already-translated part, there is no need to untranslate
>things (which is impossible anyway).
I agree it is "impossible" - but there is a _need_. If the handle was in
text mode and then switched back to binary mode then the CRLF need to be
"put back". (E.g. a CRLF MIME header on a file which then switches to embedded
JPEG or whatever.).
If the stream is seek-able then safe approach is to seek and re-read.
>Similar things may be done for all
>the other needed translations.
Patches welcome. I wrote the CRLF layer very quickly.
But crlf layer does not copy either, we use the CRLF layer as the buffer
layer.
>
>[It is also very efficient to report only the part until this buffer as
>available, if it is available. This allows simple inlining of getc().]
>
>I do not even want to mention that in many cases all Perl needs is a
>pointer to the buffer, but PerlIO insists on copying the stuff...
PerlIO does _NOT_ insist on copying - that is exactly the sort of thing
the Win32 CRTL do which I am trying to avoid.
There may be the odd spot where implementation is not optimal.
>
>I do not think that PerlIO is ready for 5.6.2.
>
>Hope this helps,
>Ilya
>
>---
>Flags:
> category=core
> severity=critical
>---
>Site configuration information for perl v5.7.1:
>
>Configured by vera at Mon Jun 25 01:33:44 PDT 2001.
>
>Summary of my perl5 (revision 5.0 version 7 subversion 17) configuration:
> Platform:
> osname=os2, osvers=2.30, archname=os2
> uname='os2 ia-ia 2 2.30 i386 '
> config_args='-des -Dprefix=i:/perllib -Dusedevel'
> hint=recommended, useposix=true, d_sigaction=define
> usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
> useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
> use64bitint=undef use64bitall=undef uselongdouble=undef
> Compiler:
> cc='gcc', ccflags ='-Zomf -Zmt -DDOSISH -DOS2=2 -DEMBED -I. -D_EMX_CRT_REV_=63 -Wall',
> optimize='-O2 -fomit-frame-pointer -malign-loops=2 -malign-jumps=2 -malign-functions=2 -s',
> cppflags='-Zomf -Zmt -DDOSISH -DOS2=2 -DEMBED -I. -D_EMX_CRT_REV_=63 -Wall'
> ccversion='', gccversion='2.8.1', gccosandvers=''
> intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
> d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
> ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
> alignbytes=4, usemymalloc=y, prototype=define
> Linker and Libraries:
> ld='gcc', ldflags ='-Zexe -Zomf -Zmt -Zcrtdll -Zstack 32000 -Zlinker /e:2'
> libpth=i:/emx.add/lib i:/emx/lib D:/DEVTOOLS/OPENGL/LIB I:/JAVA11/LIB i:/emx/lib/mt
> libs=-lsocket -lm -lbsd
> perllibs=-lsocket -lm -lbsd
> libc=i:/emx/lib/mt/c_import.lib, so=dll, useshrplib=true, libperl=libperl.lib
> Dynamic Linking:
> dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
> cccdlflags='-Zdll', lddlflags='-Zdll -Zomf -Zmt -Zcrtdll -Zlinker /e:2'
>
>Locally applied patches:
> DEVEL10906
>
>---
>@INC for perl v5.7.1:
> lib/os2
> lib
> i:/perllib/lib/5.7.1/os2
> i:/perllib/lib/5.7.1
> i:/perllib/lib/site_perl/5.7.1/os2
> i:/perllib/lib/site_perl/5.7.1
> i:/perllib/lib/site_perl/5.00553/os2
> i:/perllib/lib/site_perl/5.00553
> i:/perllib/lib/site_perl
> .
>
>---
>Environment for perl v5.7.1:
> HOME=j:/home
> LANG=EN_US
> LANGUAGE (unset)
> LD_LIBRARY_PATH (unset)
> LOGDIR (unset)
> PATH=i:\NETSCAPE\PROGRAM;d:\OPENDOC\EPM;d:\OPENDOC\OREXX;d:\OPENDOC\BIN;i:\DRV;i:\TOOLKIT\BETA\BIN\PMEISTER;i:\TOOLKIT\BETA\BIN;i:\TOOLKIT\BIN\OSARED;i:\TOOLKIT\SOM\COMMON;i:\TOOLKIT\SOM\BIN;i:\TOOLKIT\BIN;D:\OS2;D:\OS2\SYSTEM;i:\VABASIC\SOM\BIN;D:\OS2\INSTALL;D:\;D:\OS2\MDOS;D:\OS2\APPS;i:\BIN;i:\UTILS;i:\EMACS\19.33\BIN;i:\emx.add\BIN;i:\EMX\BIN;F:\PBM;i:\EMTEX\bin;i:\ckermit;I:\PAGETURN;d:\tcpip\bin;D:\TCPIP\UMAIL;D:\VIEWER\BIN;I:\XFree86\bin;i:\VABASIC\BIN;i:\SMART;I:\Java11\rmi-iiop\bin;I:\JAVA11\BIN;d:\JAVAOS2\bin;D:\MMOS2;I:\JAVA11\ICATJAVA\BIN;I:\gs\tools;I:\gs\gs7.00\bin;
> PERLLIB_PREFIX=f:/perllib;i:/perllib
> PERL_BADLANG (unset)
> PERL_SH_DIR=i:/bin
> SHELL (unset)
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/
Thread Previous
|
Thread Next