What's left to do? [LONG]
From: scozens
Date: March 22, 2000 00:20
Subject: What's left to do? [LONG]
Message ID: 492568AA.001ACB7C.00@pwj-gw-n001.pwj.co.jp
There are now three Todo files floating around. What I'd
like to do is update them all and turn them into a Grand
Unified Todo which explains fully what we want, how urgent
we can consider it, and who (if anyone) is working on it.
Let's see what we can make of the Todos out there at the
moment. I'd appreciate as many comments on this as possible
so I can put this all together into something (hopefully)
useful.
(And any analysis I make of how to fix something is
completely IMHO, and I reserve the right to be wrong. 1/2 :)
(Bugs)
! fix small memory leaks on compile-time failures
This is echoed in perltodo:
# =head2 Memory leaks from failed eval/regcomp
# The only known memory leaks in Perl are in failed code or regexp
# compilation. Fix this. Hugo Van Der Sanden will attempt this but
# won't have tuits until January 1999.
Hugo, did you have any luck tracking this down? At
the very least is there a test case for this?
(Unicode)
*sigh*. I think the most important thing in here is the
line disciplines. This is something that's been promised for a
while.
! add support for I/O disciplines
! - a way to specify disciplines when opening things:
!     open(F, "<:crlf :utf16", $file)
! - a way to specify disciplines for an already opened handle:
!     binmode(STDIN, ":slurp :raw")
! - a way to set default disciplines for all handle constructors:
!     use open IN => ":any", OUT => ":utf8", SYS => ":utf16"
There are two issues here, really. The first is the UTF8/input
format matter, the second is the line-ending/other flags. I think
(although I've not studied it in depth) the way I'd attack input
modes would be to use IoFLAGS to set the mode for a given
filehandle, and then to set SvUTF8_on (or whatever) for anything
returned from a filehandle where, eg,
IoFLAGS(io) & IOf_UTF8
This would require a few lines of change to each function that
reads from a filehandle.
Extending binmode to set these flags as well as the current
disciplines doesn't look like it would be too hard to do.
That said, it still needs doing. Is anyone actively working on
this?
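To make the flag idea concrete, here is a minimal, self-contained sketch. It does not use perl's real headers: the toy_io and toy_sv structs and every TOY_ name are hypothetical stand-ins for the IO and SV structures, so it only illustrates the shape of the change, not actual core code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for perl's IO and SV flag bits. */
#define TOY_IOf_UTF8  0x01u   /* handle delivers UTF-8 data */
#define TOY_IOf_CRLF  0x02u   /* handle translates CRLF line endings */
#define TOY_SVf_UTF8  0x01u   /* scalar holds UTF-8 characters */

struct toy_io { unsigned flags; };
struct toy_sv { const char *pv; size_t len; unsigned flags; };

/* What a binmode-like call would do: add mode flags to a handle. */
static void toy_io_set_mode(struct toy_io *io, unsigned mode)
{
    assert(io != NULL);
    io->flags |= mode;
}

/* The few lines each reading function would gain: tag the result
 * as UTF-8 when the handle it came from is in UTF-8 mode. */
static void toy_sv_from_io(struct toy_sv *sv, const struct toy_io *io,
                           const char *data, size_t len)
{
    assert(sv != NULL && io != NULL);
    sv->pv = data;
    sv->len = len;
    sv->flags = 0;
    if (io->flags & TOY_IOf_UTF8)
        sv->flags |= TOY_SVf_UTF8;
}
```

The point of the sketch is that the per-read cost is a single flag test, which is why the change to each reading function should stay small.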
The other issue is the line ending disciplines, what constitutes
EOL. The infrastructure for this is already done - L<open> exists
and does the right thing, C<binmode> is prepared to accept more
disciplines, and it's even all documented. C<open> doesn't look
like it's ready yet, though. We then have to somehow store the
line ending with the filehandle - if we're going to be extensible,
I'd like to see not just binary flags ("slurp mode", "crlf" &c.)
but arbitrary strings or - who'dathunkit - CVs in there. This
will make Perl_sv_gets into a bit of a mess, though.
Uh, more of a mess than it already is.
Again, any players on this one?
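One possible shape for an extensible line-ending record is a tagged union stored with the handle: a slurp flag, an arbitrary separator string, or a callback. The sketch below is an assumption about how that might look, with hypothetical toy_ names throughout; the callback stands in for a CV, and none of this is taken from Perl_sv_gets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record-separator description stored with a handle. */
enum toy_eol_kind { TOY_EOL_SLURP, TOY_EOL_STRING, TOY_EOL_CALLBACK };

/* A callback returns the length of the first record in buf,
 * or 0 if no complete record is available yet. */
typedef size_t (*toy_eol_fn)(const char *buf, size_t len);

struct toy_eol {
    enum toy_eol_kind kind;
    const char *str;     /* used when kind == TOY_EOL_STRING */
    size_t str_len;
    toy_eol_fn fn;       /* used when kind == TOY_EOL_CALLBACK */
};

/* Example callback: fixed-width records of four bytes. */
static size_t toy_fixed4(const char *buf, size_t len)
{
    (void)buf;
    return len >= 4 ? 4 : 0;
}

/* Return the length of the first record in buf, separator included,
 * or 0 if the buffer holds no complete record. In slurp mode the
 * whole buffer is one record. */
static size_t toy_record_len(const struct toy_eol *eol,
                             const char *buf, size_t len)
{
    size_t i;

    assert(eol != NULL && buf != NULL);
    switch (eol->kind) {
    case TOY_EOL_SLURP:
        return len;
    case TOY_EOL_STRING:
        if (eol->str_len == 0 || eol->str_len > len)
            return 0;
        for (i = 0; i + eol->str_len <= len; i++) {
            if (memcmp(buf + i, eol->str, eol->str_len) == 0)
                return i + eol->str_len;
        }
        return 0;
    case TOY_EOL_CALLBACK:
        return eol->fn ? eol->fn(buf, len) : 0;
    }
    return 0;
}
```

Keeping the dispatch in one switch at least confines the mess to a single place in the reading code, at the cost of one extra branch per record.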
Conversions:
! finish byte <-> utf8 and localencoding <-> utf8 conversions
Where? What's outstanding?
! make substr($bytestr,0,0,$charstr) do the right conversion
Fun. Fun, fun, fun.
! eliminate need for "use utf8;"
For reference, utf8 currently claims to do the following:
Bytes in the source text that have their high-bit set will be
treated as being part of a literal UTF-8 character.
In the absence of inputs marked as UTF-8, regular expressions
within the scope of this pragma will default to using character
semantics instead of byte semantics.
I'd like to see the first being a default; would this require
any more work than just commenting out line 36 and uncommenting
line 35 of toke.c? (There's a corresponding change in pad_allocmy too.)
In any case, can we get rid of HINT_UTF8 and replace it with
(!HINT_BYTE) or are there some subtleties I've missed?
! autoload utf8_heavy.pl's swash routines in swash_init()
I don't quite understand this, I'm afraid - swashes went
way over my head.
! autoload byte.pm when byte:: is seen by the parser
Nor this - why do we want to do this?
! check uv_to_utf8() calls for buffer overflow
} Unicode collation? http://www.unicode.org/unicode/reports/tr10/
Anyone working on these?
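On the uv_to_utf8() overflow item, one way to make the check mechanical is to give the encoder the size of the destination and have it refuse to write past it. The function below is a hypothetical bounded variant, not the core routine or its signature. It is deliberately limited to code points up to 0x10FFFF and does not reject surrogates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded encoder: writes the UTF-8 form of code point
 * uv into buf, which has room for avail bytes. Returns the number of
 * bytes written, or 0 if uv is above 0x10FFFF or does not fit. */
static size_t toy_uv_to_utf8(unsigned char *buf, size_t avail,
                             unsigned long uv)
{
    size_t need;

    assert(buf != NULL);
    if (uv < 0x80UL)
        need = 1;
    else if (uv < 0x800UL)
        need = 2;
    else if (uv < 0x10000UL)
        need = 3;
    else if (uv <= 0x10FFFFUL)
        need = 4;
    else
        return 0;

    /* The bounds check: nothing is written unless it all fits. */
    if (need > avail)
        return 0;

    switch (need) {
    case 1:
        buf[0] = (unsigned char)uv;
        break;
    case 2:
        buf[0] = (unsigned char)(0xC0 | (uv >> 6));
        buf[1] = (unsigned char)(0x80 | (uv & 0x3F));
        break;
    case 3:
        buf[0] = (unsigned char)(0xE0 | (uv >> 12));
        buf[1] = (unsigned char)(0x80 | ((uv >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (uv & 0x3F));
        break;
    default:
        buf[0] = (unsigned char)(0xF0 | (uv >> 18));
        buf[1] = (unsigned char)(0x80 | ((uv >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((uv >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (uv & 0x3F));
        break;
    }
    return need;
}
```

An audit of the existing callers would then amount to checking that each one can say how much room it has left.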
I'm leaving the heavy work on threading and compiling as-is, because
I assume the people best placed to work on these fully understand
them anyway, and they seem in pretty good hands.
Tasks for mere mortals:
(Threading)
# Which of the standard modules are thread-safe? Which CPAN modules?
# How easy is it to fix those non-safe modules?
# Threading is still experimental. Every reproducible bug identifies
# something else for us to fix. Find and submit more of these problems.
Basically lots of destruction testing. Anyone can do this.
(Compiling)
# Figure out how and where byteperl will be built for the various
# platforms.
What do we mean by this?
# Save byte-compiled modules on disk.
Java .jars here we come. I'm envisaging a pragma which is a mixture of
the guts of ByteLoader and perlcc, and either loads or JIT compiles
modules. I don't think it would be too hard, would it?
Anybody want to play?
# Auto-produce executable.
Don't understand. Isn't this what perlcc does, or do we want to
produce true executables from bytecode? I still can't get
perlcc -b -o hello -e 'print qq/Hello world\n/'
to do anything interesting.
(API)
Namespace issues are being thrashed out in another thread and I'll
see what the outcome is there. From Todo-5.6:
! CPP-space: restrict what we export from headers when !PERL_CORE
! header-space: move into CORE/perl/?
! API-space: complete the list of things that constitute public api
From perltodo:
# symbol-space: "pl_" prefix for all global vars
# "Perl_" prefix for all functions
(This part is done now, isn't it?)
# env-space: Configure should use PERL_CONFIG instead of CONFIG etc.
(Configure)
Configure needs a pumpking, doesn't it? Blegh. Can I safely assume
that Andy is the main man for these?:
! fix the vicious cyclic multidependency of cc <-> libpth <-> loclibpth
! libswanted <-> usethreads <-> use64bitint <-> use64bitall <->
! uselargefiles <-> ...
! make configuring+building away from source directory work (VPATH et al)
! _r support
! UNIX98 support: reader-writer locks, realtime/asynchronous IO
! Configure probe for quad_t, uquad_t, and (argh) u_quad_t
! IPv6 support: see RFC2292, RFC2553
# Install HTML
This one claims to be owned by the other Andy, Andy Wardley. Yesno?
# Portable installations
I remember talking to someone who had evil designs on patching the
perl binary to remove full paths. Is there still interest in doing
this, hopefully more elegantly?
} cross-compilation support
This is huge. Absolutely mega. Anyone working on it?
(POSIX)
! POSIX 1003.1 1996 Edition support--realtime stuff:
! POSIX semaphores, message queues, shared memory, realtime clocks,
! timers, signals (the metaconfig units mostly already exist for these)
# Update the POSIX extension to conform with the POSIX 1003.1 Edition 2
! POSIX [=bar=] and [.zap.] would be nice too but there's no API for them
# POSIX on non-POSIX
} use posix calls internally where possible
(64-bits)
! Long doubles: figure out where the PV->NV->PV conversion gets it
! wrong at least in AIX and Tru64 (V5.0 and onwards)
I have a sneaking suspicion this is something Spider Boardman's
looking at. I also have a sneaking suspicion I just made that up
out of thin air.
# Verify complete 64 bit support so that the value of sysseek,
# or C<-s>, or stat(), or tell can fit into a perl number without
# losing precision.
What's the latest on this?
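The precision worry can be stated as a small test: does a 64-bit offset survive being stored in a double? The sketch below assumes an IEEE 754 double with a 53-bit mantissa, which is an assumption about the platform, and uses a hypothetical helper name. It says nothing about how perl itself stores the value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: IEEE 754 double, 53 bits of mantissa. */
#define TOY_MANT_BITS 53

/* Returns 1 if the unsigned 64-bit value v can be stored in a double
 * without rounding. Trailing zero bits are absorbed by the exponent,
 * so only the remaining significant bits must fit in the mantissa. */
static int toy_fits_in_double(uint64_t v)
{
    if (v == 0)
        return 1;
    while ((v & 1) == 0)
        v >>= 1;
    return v < ((uint64_t)1 << TOY_MANT_BITS);
}
```

Under that assumption every offset up to 2**53 is safe, so the loss only shows up on files beyond roughly 9 petabytes, or when the number is held in something narrower than a double.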
(Signals)
Today's minefield is:
! custom opcodes
! alternate runops() for signal despatch
! figure out how to die() in delayed sighandler
! make Thread::Signal work under useithreads
(Locales)
} deprecate traditional/legacy locales?
} How do locales work across packages?
} figure out how to support Unicode locales (ICU/iconv)
Anyone want to comment on any of these?
(Security)
! use fchown, fchmod (and futimes?) internally when possible
! use fchdir (how portable?)
How are we doing with these? Do we have configure tests in
place, and is someone prepared to go on a search-and-destroy
mission?
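For anyone unsure why the f* calls matter: they act on an already opened descriptor, so the file that was checked is the file that gets changed, even if the path is swapped in between. A minimal POSIX sketch follows, with a hypothetical helper name. It assumes the caller can open the file for reading, which a real implementation could not always rely on.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: change the mode of a regular file through its
 * descriptor. The fstat() check and the fchmod() call see the same
 * open file, so a path swapped in between cannot redirect the change.
 * Returns 0 on success, -1 on failure. */
static int toy_chmod_regular(const char *path, mode_t mode)
{
    struct stat st;
    int fd;
    int rc = -1;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode))
        rc = fchmod(fd, mode);
    close(fd);
    return rc;
}
```

The search-and-destroy mission would be looking for places where a stat() on a path is followed by a chmod() or chown() on the same path.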
! create secure reliable portable temporary file modules
Well, we now appear to have one of these, File::Temp.
Is this going in core?
! audit the standard utilities for security problems and fix them
Tom, you were looking at this - how much is done, how much is
still screamably dangerous?
(Regexps)
# Interpolated regex performance bugs
Fixed with qr// - can it be taken out of perltodo
or does it need more documentation?
# Regular Expression debugger
We have this now, don't we?
} Rewrite regexp parser for better integrated optimization
Huh?
! make RE engine thread-safe
! a way to do full character set arithmetics
! approximate matching
(Debugger)
} possible (debugger) pragma
# Debugger attach/detach
} support in perlmain to rerun debugger
} modifiable $1 et al
(Ports)
# MacPerl
Any news on the reintegration?
! Win32 stuff:
! sort out the spawnvp() mess for system('a','b','c') compatibility
! work out DLL versioning
(Optimisations)
! mmap for speeding up input?
Configure probes for this, I think, so it's a matter of plugging
it into the right places. And knowing where the right places are.
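As a reminder of what mmap-based input looks like outside the core, here is a small POSIX sketch that scans a file through a mapping instead of read() calls. The function name is hypothetical and it makes no claim about where this would plug into perl's input code. A real version would fall back to read() for pipes, terminals and anything else that cannot be mapped.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count newline characters in a file by mapping it into memory.
 * Returns the count, or -1 if the file cannot be opened or mapped. */
static long toy_count_lines_mmap(const char *path)
{
    struct stat st;
    const char *p;
    void *map;
    long n = 0;
    off_t i;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects a zero length */
        close(fd);
        return 0;
    }
    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (map == MAP_FAILED)
        return -1;

    p = map;
    for (i = 0; i < st.st_size; i++) {
        if (p[i] == '\n')
            n++;
    }
    munmap(map, (size_t)st.st_size);
    return n;
}
```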
} constant function cache
} rcatmaybe
} shrink opcode tables via multiple implementations selected in peep
Can someone provide more information on what's required?
} foreach(reverse...)
I wonder how this can be done. You'd have to delay evaluation of the
range until runtime.
} optimize away @_ where possible
There is work going on with this, isn't there?
} tail recursion removal
Yikes. That's a biggy. Anyone thinking about it?
} cache eval tree (unless lexical outer scope used (mark in &compiling?))
} cache hash value? (Not a win, according to Guido)
} "one pass" global destruction
} LRU cache of regexp: foreach $pat (@pats) { foo() if /$pat/ }
(Misc. Internals)
Here's the mixed bag:
} switch structures
Everyone wants a switch statement but nobody can agree on the right
way to do it.
# C<magic_setisa> should be made to update %FIELDS
I have no idea what this means. Is it a scary pseudohash thing?
# there was talk of a mark-and-sweep garbage collector at TPC2
Is this idea still alive or dead?
# Make XS easier to use
There's now an xstut. SWIG exists and some people are using it
although I personally can't abide it. What else can be done?
# Make embedded Perl easier to use
Similarly, we now have storming documentation on embedding
Perl.
! floating point handling: nans, infinities, fp exception masks, etc.
! fix the basic arithmetics (+ - * / %) to preserve IVness/UVness if
! both arguments are IVs/UVs
! sendmsg, recvmsg? (Configure doesn't probe for these but the units exist)
! setitimer, getitimer? (the metaconfig units exist)
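For the arithmetic item in the list above (preserving IVness), the usual approach is to test for overflow before doing the integer operation and only fall back to floating point when the integer result would not fit. The sketch below shows that for addition only, using a hypothetical tagged struct rather than perl's SV, with long standing in for IV and double for NV.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical tagged number: an integer (IV) or a double (NV). */
struct toy_num {
    int is_iv;      /* 1 if iv holds the value, 0 if nv does */
    long iv;
    double nv;
};

/* Add two integers, keeping an integer result whenever it fits.
 * The overflow test runs before the addition, because signed
 * overflow itself is undefined behaviour in C. */
static struct toy_num toy_add(long a, long b)
{
    struct toy_num r;

    r.is_iv = 1;
    r.iv = 0;
    r.nv = 0.0;
    if ((b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b)) {
        r.is_iv = 0;
        r.nv = (double)a + (double)b;
    } else {
        r.iv = a + b;
    }
    return r;
}
```

Subtraction and multiplication need their own overflow tests, and a UV case would need a third tag, so the real change is larger than this suggests.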
(Modules)
Wanted:
} VecArray Implement array using vec()
Nat has this, yes?
} SubstrArray Implement array using substr()
} VirtualArray Implement array using a file
} ShiftSplice Defines shift et al in terms of splice method
IIRC, Mr. Schwern has one of these, but I forget which.
# y2k localtime/gmtime
This is D'oh::Year. Can we scrub it from perltodo?
} use less
Optimisation tradeoffs. Do we *have* any tradeoff areas
at the moment we could use for this?
# Bundled modules
Storing modules in zips. Can this be fixed with source filters?
# Alternative RE Syntax
A module for creating regular expressions. I'd like to
work on this if anyone thinks it would be valuable.
} gettimeofday
# See Time::HiRes.
! sub-second sleep()? alarm()? time()?
These are all vaguely related.
! turn Cwd into an XS module?
Any volunteers?
} ExtUtils::CppSymbol?
Extract cpp symbols for use in Errno, Fcntl, POSIX
et al.
} Devel::MProf
Don't we have one of these now?
# Automatic tests against CPAN
This refers to testing CPAN modules with a new Perl. I've
been doing this manually, but an automated solution would
be cool, and shouldn't be too hard if you make judicious
use of the L<CPAN> module.
# Procedural options
Turning IO::* and friends into procedural interfaces. Is
this really wanted, and is someone working on it?
# RPC
Uh, isn't there something on CPAN that can do this?
# Make File::Find export C<$name> etc manually
Ten or fifteen minutes work for someone with the motivation,
I'd think.
# Finish a proper Ioctl module.
What's improper about the current one?
# perfect a Perl version of expect
Work with IO::Tty and put these in core replacing Comm.pl
Is this really what we want/need?
# GUI::Native
Sounds nasty. And big.
# Update semibroken auxiliary tools; h2ph, a2p, etc.
What's the status of these now?
! add new modules (Archive::Tar, Compress::Zlib, CPAN::FTP?)
! upgrade to newer versions of all independently maintained modules
# Automate the checking of versions in the standard distribution
(POD)
# Brad's PodParser code needs to become part of the core, and the Pod::*
# and pod2* programs rewritten to use this standard parser.
! replace pod2html with new PodtoHtml? (requires other modules from CPAN)
There've been a lot of changes in the various podlators and POD
parsers. I'm not sure any of the Todos are accurate any more.
POD people, what's the status?
# Podchecker
Now exists, yes.
# Separate function manpages by default
splitman to create man 3p pages for functions/operators and
install by default. Is this a good idea? Will you do it?
# Users can't find the manpages
Fixing manpaths to include the Perl man pages too. No idea how
to do this.
# Install ALL Documentation
We do this, now, don't we?
# Adapt www.linuxhq.com for Perl
Nat, *WHAT*?
# Replace man with a perl program
Tom has done something about this, I think, but IIRC Larry
wasn't in favour. Should it be taken off perltodo?
Reorganising perldoc:
# Include a search tool
The idea is that there's an index to all the POD pages by
keyword (presumably autogenerated from individual pages) and
we can look stuff up by keyword. Has this found favour?
# Include a locate tool
I have this, but it's not integrated into perldoc. If people
want, I'll combine it in.
(Documentation)
! reorg tutorials vs. reference sections
This is mentioned in a number of places. How to do it? .ref and .tut
extensions for reference and tutorial documentation respectively?
Split into different directories? ...
# Unicode tutorial
A tutorial counterpart to the perlunicode stuff would be good, yes.
# move operator reference into perlfunc.
Operators are functions. Well, actually, I'm in the camp that says
functions are operators. But anyway, a single perl built-ins
reference would be A Good Idea.
# Regular expressions (tutorial)
Robin Berjon has, it appears, volunteered. Robin, any luck so far?
# I/O (tutorial)
It seems this is in the capable (yet busy) hands of Mark-Jason
Dominus.
# pack/unpack (tutorial)
Yes. Please. Would anyone like to write one?
# Debugging (tutorial)
# Ronald Kimball (rjk@linguist.dartmouth.edu) has volunteered.
How's that going?
} comprehensive perldelta.pod
I'm happy to work on this if it's not happening.
I'm also happy to write, if people think they would be useful:
perlhiaw - Perl: How it all works
perltuple - Unicode tuples syntax and semantics
! describe new age patterns
What's outstanding here?
! update perl{guts,call,embed,xs} with additions, changes to API
We now have an autogenerated perlapi, but what's missing in these
documents?
! convert more examples to use autovivified filehandles
! document Win32 choices
! spot-check all new modules for completeness
Not sure where we are with these.
(Infrastructure)
# Mailing list archives
`Chaim suggests contacting egroup'. Has this been done?
# Regression Tests
# Brent LaVelle (lavelle@metronet.com)
} regression/sanity tests for suidperl
Brent, are you still out there and working on these? I know
it's an on-going thing, but how's it going?
Note use of __DIE__ hook to provide error report.
(Misc. Misc.)
# Design a webperl environment
I don't see what this is getting at.
# More work on a safe and secure execution environment
# for mobile agents
Is Safe not enough? What's left? What about safeperl?
} pack "(stuff)*", "(stuff)?", "(stuff)+", "(stuff)4"
} contiguous bitfields in pack/unpack
Anyone up for this?
} lexperl
I've been boggling at this for six months now, and I
still have no idea what it's driving at. What's
the intention?
} bundled perl preprocessor/macro facility
Source filters make this easy, but has anyone done it?
} format BOTTOM / report HANDLE
What would report HANDLE do? Produce an entire page?
} -i rename file only when successfully changed
This would probably mean a "dirty" flag or some such.
} built-in globbing
This is sort of done, isn't it?
} structured types
Is this Class::Struct, or something else?
} autocroak?
Fatal.pm?
} more generalized want()/caller()?
I'd certainly like to see a wantlvalue or equivalent.
I was going to work on it, but, uh, haven't. Any ideas
on what's required? How should we extend caller(), if
at all?
} named prototypes: sub foo ($foo, @bar) { ... } ?
} lexically scoped functions: my sub foo { ... }
These are both scary as hell.
} make tr/// return histogram in list context?
If anyone wants to do it, I had an embryonic patch
which I posted a while back.
} ref function in list context?
What would it do?
} loop control on do{} et al
Is this still wanted, and if so, is there a volunteer?
} all ARGV input should act like <>
When doesn't it?
} iterators/lazy evaluation/continuations/first/
} first_defined/short-circuiting grep/??
Aaaargh.
} a way to make << and >> to shift bitvectors instead of numbers
This makes sense to someone.
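One reading of this item is that << applied to a string should shift the whole string as a row of bits rather than converting it to a number first. The sketch below shows that operation on a byte buffer. It assumes byte 0 is the most significant end and that vacated bits are filled with zeros; both are guesses at the intended semantics, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shift a bit vector left by n bits in place. Byte 0 is treated as
 * the most significant byte; bits shifted off the front are lost and
 * zeros are shifted in at the end. */
static void toy_bitvec_shl(unsigned char *v, size_t len, size_t n)
{
    size_t bytes = n / 8;
    size_t bits = n % 8;
    size_t i;

    assert(v != NULL || len == 0);
    if (len == 0)
        return;
    if (bytes >= len) {
        memset(v, 0, len);
        return;
    }
    for (i = 0; i < len; i++) {
        size_t src = i + bytes;
        unsigned cur = src < len ? v[src] : 0u;
        unsigned next = src + 1 < len ? v[src + 1] : 0u;

        /* High part comes from the current source byte, low part
         * from the top bits of the byte after it. */
        v[i] = (unsigned char)((cur << bits) | (next >> (8 - bits)));
    }
}
```

Shifting the bytes 0x12 0x34 0x56 left by four bits with this function gives 0x23 0x45 0x60.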
All comments welcome, both on p5p and by email. If anything
comes of this, I'll report back in a month with an updated
list.
Simon