This Week on perl5-porters - 18-24 May 2008
From: David Landgren
Date: May 30, 2008 14:23
Subject: This Week on perl5-porters - 18-24 May 2008
Message ID: 48407053.1010304@landgren.net
"Ah, more details about filenames. Well, this sounds positively weird.
Octet strings are not particularly user-friendly if you can't
interpret them as characters reliably.
From what you say, and what I think I've heard elsewhere, Unix
filename interpretation is a mess. Seems like the only bigger mess
I've heard about is VMS file handling, where they seem to have a
choice of several messes." -- Glenn Linderman, deep in the heart of
Unicode, case conversion, filenames, encodings, character sets, ß and
other exciting issues.
Topics of Interest
Another perldoc shortcut
Tom Christiansen commented on Gisle Aas's perldoc shortcut (that
"perldoc ipc" would redirect to "perlipc", assuming no ipc.{pod,pm}
existed), saying that in pre-5.8 times he had been working on a
technique to make "perlipc" itself, run from the command-line, do the
same thing. Somewhere along the line, things went astray and the work
never made it into the core.
not bitter, not really
http://xrl.us/bk9mc
"File::Path::mkpath()" incompatibility in perl-5.10
I had expected to make some progress on this issue, this week, but
Real Life is eating my tuits like popcorn at the moment.
next week, cross my heart
http://xrl.us/bk9me
On the almost impossibility to write correct XS modules
I might preface this thread "on the almost impossibility to write a
correct summary of a complex subject". Marc Lehmann had written a few
weeks ago that a bare "char *" through an XS API is fraught with
peril, because there is no metadata available to tell you if it's
Latin-1, KOI8-R, UTF-8 or something else.
The thread blossomed this week, with a long-running debate about what
is broken (and when, and how). One point that was made is that Win32
encodes filenames in a particular way that doesn't really jibe with
the rest of the internals. Unfortunately, it is only with hindsight
that the problem really became apparent, hence the dilemma is that
fixing it would break everything that has tried, with various degrees
of success, to work around it.
The "utf8" flag on SVs was again singled out as being responsible for
world hunger and other assorted ills, with a number of examples
demonstrating the problems.
Rafael Garcia-Suarez outlined an approach that just may be a way
forward out of the mess. After listening to Juerd Waalboer, he thought
that marking an SV as "binary" and thereby disqualified from being
upgraded to Unicode would be quite useful.
Glenn Linderman invented "blorf" as an opaque token for discussing the
issues without people getting sidetracked over definitions of bytes,
strings, characters, numbers and codepoints.
hard core
http://xrl.us/bk9mg
It's wafer thin!
David Nicol's tiny patch to document the empty pattern ("m//") more
clearly sparked a fairly intense technical debate over how to get rid
of the empty pattern altogether.
One point of particular interest was when Aristotle Pagaltzis
suggested a "s///R" modifier which would return a modified copy of the
original string, instead of modifying the contents and returning the
number of matches made.
As it turns out, this would solve a number of problems very nicely,
not the least being the elegantly succinct
my @changed = map { s/$this/$that/R } @list;
so let's have it already
http://xrl.us/bk9mi
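For comparison, here is a minimal sketch of what has to be written
without such a modifier: copy each element, substitute in the copy,
and hand back the copy. The "/R" flag above is only a proposal from
the thread; the sample data and the $copy variable below are made up
for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; $this and $that mirror the names used above.
my @list = ('red apple', 'red pear', 'green plum');
my $this = qr/red/;
my $that = 'blue';

# Copy each element, run the substitution on the copy, and return the
# copy. The elements of @list are left untouched.
my @changed = map { (my $copy = $_) =~ s/$this/$that/; $copy } @list;

print "$_\n" for @changed;
```

A bare "s/$this/$that/" inside the map would instead modify @list in
place and return the substitution counts, which is the wart the
proposal wants to remove.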
Compiling 5.10 with g++ 4.3.0
Not content with compiling perl with old gcc compilers, Bram took a
very new one for a spin to see how things worked out.
It did of course go *boom* (otherwise you probably wouldn't be reading
about it). Bram traced the problems down to typedefs and enums in
system headers, and wondered how in Configure this could be sorted
out.
duty now for the future
http://xrl.us/bk9mk
"Getopt::Long", + options, installperl and +v
Nicholas Clark was looking at how to factor out the common code in
"installman" and "installperl" and noticed that the main sticking
point regarding "installperl" was that it accepts a "+v" switch (which
does something different from "-v"), using hand-rolled @ARGV processing.
This precludes it from using "Getopt::Long" because, while
"Getopt::Long" can be taught to accept "-x" and "+x", it offers no way
of discriminating between the two.
Johan Vromans said that as it turns out, with a bit of hand-holding,
it is possible to coax the information out as things stand, and he
plans to improve support for - and + switches in a future release.
Nicholas thought that a middle path might be to keep the hand-rolled
code, but adjust it to dump its results into an %opts hash, which
would allow a drop-in replacement when "Getopt::Long" gets updated
with the needed functionality.
This brought forth a long discourse from Tom Christiansen, who
admitted to the wrong kind of laziness regarding command-line switches
by resorting to hand-rolling code to deal with a solitary switch when
in fact it would have been better to rely on a module. When he quizzed
Larry Wall about it during the first decade of Perl's development,
Larry admitted to rolling his own frequently, since it seemed a bit of
a waste in his eyes to pull in a module for just one or two switches
for a program little more than a one-liner. As a peace
offering for his own hand-rolling sins, Tom offered the list the
ultimate file renaming Perl program.
bespoke options
http://xrl.us/bk9mn
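The middle path Nicholas described might look something like the
sketch below: hand-rolled processing that records "-x" and "+x" under
separate keys of an %opts hash. This is only an illustration, not the
code in "installperl"; the sub name parse_switches and the sample
arguments are assumptions.

```perl
use strict;
use warnings;

# Hypothetical hand-rolled switch parser: "-x" and "+x" are stored under
# separate keys so that callers can tell the two forms apart.
sub parse_switches {
    my @args = @_;
    my (%opts, @rest);
    while (@args) {
        my $arg = shift @args;
        if ($arg eq '--') {                 # explicit end of switches
            push @rest, @args;
            last;
        }
        elsif ($arg =~ /^([-+])(\w+)$/) {   # e.g. "-v" or "+v"
            $opts{"$1$2"} = 1;
        }
        else {
            push @rest, $arg;               # not a switch: keep as operand
        }
    }
    return (\%opts, \@rest);
}

# Sample invocation with made-up arguments.
my ($opts, $rest) = parse_switches(qw(+v -n destdir));
print "verbose (+v) requested\n" if $opts->{'+v'};
print "operands: @$rest\n";
```

Because the results end up in a plain hash, the parsing sub could later
be swapped for a module-based one without touching the code that reads
%opts.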
On broken manpages, trolling, inconsistent implementation and the
difficulty to fix bugs
Marc Lehmann wrote a long response to Jan Dubois as a spin-off from
the "On the impossibility of writing XS correctly" thread, stating that
Perl's Unicode handling is broken because some parts of the core deal
with Unicode one way, and other parts another way. This leads to
annoying bugs, in that they are hard to identify, and hard to fix.
Tom Christiansen called him out for excessive use of rhetoric and
asked him to clarify a couple of points. Several messages later Yves
Orton offered a nice summary of the situation that showed where things
break down. Then people started to speak about encodings, bytes,
characters and character sets and as usual my eyes began to acquire
that dead fish look.
see also
http://xrl.us/bk9mp
On the problem of strings and binary data in Perl
On the subject of subjects on the problem of things, Yves Orton broke
out into a new thread to discuss the schizophrenic attitude that Perl
has when dealing with strings. He put forward a proposal for
identifying and processing Unicode strings and asked people to point
out where he was wrong. Rafael Garcia-Suarez made a decent effort at
doing just that.
Juerd Waalboer provided a contrarian argument, suggesting that Unicode
works pretty well in Perl, insofar as one can have strings containing
Unicode, and other strings containing binary data, because in a
correct program, one usually doesn't have the two appearing in the
same string (such as having the Thai-encoded name of a Thai person
concatenated with the slurped contents of a PNG file representing his
signature in the same Perl scalar). In Juerd's eyes, the main problems
come about when dealing with pure binary data and hoping that it
doesn't wind up being treated as Unicode when it shouldn't.
more recommended reading
http://xrl.us/bk9mr
As a followup to the above discussion, Juerd announced that he had
released BLOB to CPAN.
http://xrl.us/bk9mt
"English.pm" alias for "%+"
Amir Elisha Aharoni ventured for the first time into the waters of
p5p, suggesting that %NAMED_CAPTURE would be a nice English name for
the new 5.10 "%+" variable. Yves Orton thought the idea was worthy of
consideration, but one also needed to deal with "%-" at the same time,
which could be named %NAMED_CAPTURE_LIST.
updating the babelfish
http://xrl.us/bk9mv
07arith.t failing on "_strptime('2001-2-29 12:34:56','%Y-%m-%d %H:%M:%S')"
2001 was not a leap year, so February 29, 2001 is not a valid date and
trying to parse it is an error. Apparently there is a test in
"Time::Piece" to ensure it fails
in the correct manner. Unfortunately, on some of the more exotic
platforms like VMS and OS/X, the call also correctly fails, but does
so in a way that fools the test suite.
at the third stroke it will be the 32nd of february
http://xrl.us/bk9mx
Gisle Aas gave some additional background regarding Time-Piece-1.13
test failures on HP-UX, by forwarding a message he sent to Matt
Sergeant, the author of "Time::Piece".
http://xrl.us/bk9mz
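For the record, the reason the date is bogus comes straight from the
Gregorian leap year rule. The sketch below is not taken from
"Time::Piece"; is_leap_year is a hypothetical helper written for this
summary.

```perl
use strict;
use warnings;

# Gregorian rule: every fourth year is a leap year, except century
# years, unless the century year is divisible by 400.
sub is_leap_year {
    my ($year) = @_;
    return (($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0) ? 1 : 0;
}

for my $year (1900, 2000, 2001, 2008) {
    printf "%d: %s\n", $year, is_leap_year($year) ? 'leap' : 'not leap';
}
```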
Some smoke digging (HP-UX failures)
H.Merijn Brand delved into HP-UX smoke reports to figure out what was
going wrong. "Time::Piece" was already under control (see above), but
"Math::Trig" was failing (and the only recent change has been an
upgrade to "Math::Complex"). Tests for "readdir" were also turning
black, which suggested subtler problems.
Half way through the conversation, Craig Berry announced the
integration of Gisle Aas's fix for "Time::Piece" which addressed the
VMS problems, and H.Merijn reported that it did the trick for HP-UX as
well. Using the power of CPAN, H.Merijn was able to go through
previous "Math::Complex" versions, and this allowed him to resolve
that problem.
I think the "readdir" problem was solved by upgrading the smoke harness.
The remaining failure appeared to be caused by "use blib" hoisting in
an errant directory into @INC. Bram showed him how to fix that, which
should nail down the last error.
going for O O O O
http://xrl.us/bk9m3
But then H.Merijn reported a problem with a failing blib test, and
everyone pretended to pay attention to the character encoding debates.
war knocked
http://xrl.us/bk9m5
TODO of the week
Improve the coverage of the core tests
Use "Devel::Cover" to ascertain the core modules' test coverage, then
add tests that are currently missing.
Just to help budding testers along, here is a non-exhaustive list of
suggestions to get you going (suggested by sorting out the biggest
".pm" files in lib/):
"AutoLoader"
"AutoSplit"
"Benchmark"
"Cwd"
"DB"
"Dumpvalue"
"Exporter"
"Memoize"
"NEXT"
"SelfLoader"
"charnames"
"diagnostics"
"overload"
"warnings"
Even concentrating on a single module would be helpful.
Patches of Interest
"ExtUtils::ParseXS" - Error reporting problem with INTERFACE and ALIAS
keywords
About a year ago, Ken Williams explained that, while he was the
maintainer of this module, he didn't know what was the best way to
address the problem that Robert May had brought up regarding error
reporting.
then
http://xrl.us/bk9m7
Of the two approaches supplied by Robert as a solution, Ken liked the
second one back then, and Nicholas Clark, reviving the conversation,
agreed that it seemed to make more sense.
He had a look at how things work currently, and realised that with a
new function, he could effect a small saving of space. As a result,
both the core and "EU::PXS" could rely on the function.
Nicholas wrote the function, and felt that it would make it into 5.8.9
and 5.10.1. For older releases, "ExtUtils::ParseXS" would need to
bundle the function, and emit it as required if the core didn't supply
it.
Rob thought that this sounded reasonable, except that if ever a bug is
found in the function that Nicholas just wrote, it would need to be
fixed both in the core and EU::PXS. Since this would be less than
desirable, Robert said that he would try to come up with an alternate
patch at some point.
now
http://xrl.us/bk9m9
"lib.pm" should not warn about loading ".par" files
Paul Fenwick noted that a "use lib 'Foo.par'" will issue a warning,
but load the damned thing anyway. Since someone pulling in a library
in this way probably has a pretty good idea what they're doing anyway,
Paul thought it would be a good idea to suppress the warning, just for
".par" files.
Rafael Garcia-Suarez felt that this made sense, so he applied the
patch. Steffen Müller wanted to know if this meant that lib.pm would
be dual-lifed, so that 5.8.8 could benefit from the improvements.
dual-life pragma on par
http://xrl.us/bk9nb
Indented preprocessor directives in sv.c
Jerry D. Hedden noticed that some preprocessor directives in sv.c were
not flush left, and thought that some compilers would choke on them.
H.Merijn Brand explained that it was perfectly legal according to
ANSI, although he admitted that some older compilers, such as on AIX,
would likely get into trouble over this.
Both Robin May and Andy Dougherty explained that something that does
work is to leave the # in the first column, and then indent the macro
preprocessor directive as appropriate.
hash hard left
http://xrl.us/bk9nd
New and old bugs from RT
*x{IO} bizarre copying (#3314)
Steve Peters discovered that some bizarre code that used to emit a
bizarre error message now emits a more prosaic error message. He
noticed that the change occurred way back in change #27179 and asked
if anyone had objections to backporting it to 5.8.
a leap into the unknown
http://xrl.us/bk9nf
"exists()": error message on wrong argument type is incorrect (#38955)
A couple of years ago, Jeremy Hetzler noted that "exists" may be
applied to a HASH, an ARRAY and also a subroutine name. The
documentation even admits as much.
On the other hand, for incorrect use, such as applying it to a scalar,
the error message makes mention of only HASH and ARRAY, not of
subroutines.
Bram patched the source to bring the error message into line with the
documentation and implementation, and Rafael Garcia-Suarez applied it.
language lawyers rejoice
http://xrl.us/bk9nh
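As a reminder of the three forms the error message now has to cover,
here is a short sketch. The variable and sub names are invented for
the example.

```perl
use strict;
use warnings;

# Made-up data for the demonstration.
my %colour = (red => 1);
my @queue  = (10, 20);
sub greet { return 'hello' }

my $has_key   = exists($colour{red})  ? 1 : 0;   # hash element
my $has_index = exists($queue[5])     ? 1 : 0;   # array element, out of range
my $has_sub   = exists(&greet)        ? 1 : 0;   # declared subroutine
my $no_sub    = exists(&no_such_sub)  ? 1 : 0;   # never declared

print "$has_key $has_index $has_sub $no_sub\n";  # prints "1 0 1 0"
```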
No complaint about bareword (#53806)
Rafael Garcia-Suarez supplied a fix for the "print Does::Not::Exist,
''" problem, so that the bareword is correctly identified as such, and
not stringified. Despite all the magic surrounding "print"'s first
argument, all that Rafael needed to do was to hoist a goto label four
lines higher in the source.
H.Merijn Brand applied the correction, along with Bram's tests.
http://xrl.us/bk9nj
"pod2man" loses =head2 starting ' or . (#53910)
Bram correctly identified "Pod::Man" as a dual-life module. This means
that the best place to fix this particular problem is in the CPAN
distribution, which can then be synched with blead when the problem is
fixed.
SEP
http://xrl.us/bk9nm
"IO::Seekable" + "POSIX" = constant subroutines redefined (#54186)
Part of the fallout from Nicholas Clark's corrections for this bug is
that calls with the wrong number of arguments cause the program to
croak. Rafael Garcia-Suarez felt it was safe enough to inflict on the
world. As a point of confirmation, Sébastien Aperghis-Tramoni ran a
code search and didn't find any examples of such usage.
safe to break
http://xrl.us/bk9no
"perlipc" problems
Andrew at Sundale noted a problem in the documentation in "perlipc"
concerning the signalling of negative process IDs. Steve Peters
tweaked the example to show more clearly what was happening.
perlipc and negative pids (#54412)
http://xrl.us/bk9nq
Andrew found another problem with "setsid", in that the
documentation suggests a "setsid or die" idiom, except that, if one
reads the manpage for "setsid", one learns that it returns -1 on error
(as do many other system calls). As such, if the "setsid" call fails,
the die won't be triggered.
perlipc and negative truth (#54422)
http://xrl.us/bk9ns
While we're on the subject, Andrew found one final problem concerning
the documentation for safe pipe opens.
perlipc unclear on the concept (#54424)
http://xrl.us/bk9nu
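The "setsid or die" point comes down to -1 being a true value in Perl.
A minimal sketch of an explicit check follows, assuming (as the report
says) that the call hands back -1 on failure; syscall_ok is a
hypothetical helper, and the -1 below merely stands in for a failed
call.

```perl
use strict;
use warnings;

# Hypothetical helper: only -1 (or undef) counts as failure.
sub syscall_ok {
    my ($rc) = @_;
    return (defined $rc && $rc != -1) ? 1 : 0;
}

my $rc = -1;    # stand-in for the return value of a failed call

# -1 is true in Perl, so an "or die" hanging off it would never fire.
print "plain truth test: passes\n" if $rc;
print "explicit check: fails\n"    unless syscall_ok($rc);

# In a real daemon the check might read (assumption, not the perlipc text):
#   syscall_ok(POSIX::setsid()) or die "Can't start a new session: $!";
```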
Faulty "select()" in Activestate perl (#54544)
Marc Lehmann noted that "select" returns "Unknown Error (10022)"
instead of simply timing out.
just no it
http://xrl.us/bk9nw
Assertion failure fiddling with @ISA (#54566)
Niko Tyni discovered a way of abusing @ISA that would result in an
assertion failure. Rafael Garcia-Suarez figured out what was going
wrong in mg.c and provided a patch, that H.Merijn Brand applied.
out through the smtp tunnel
http://xrl.us/bk9ny
"Can't take log of 0" error in perl 5.8.8. 64 bit (#54590)
Lourdes Peña Castillo reported that on some versions of perl, but not
others, the number 2.5e-310 gets rounded down to 0, and the log of 0
is negative infinity.
Various porters reported similar behaviour on a variety of perls,
platforms and Configure options, but no clear reasons why.
now you see it, now you don't
http://xrl.us/bk9n2
"PerlIO::via" free unrefed scalar on certain dodgy code (#54686)
Kevin Ryde wrote some slightly broken code that managed to make the
perl interpreter complain about memory problems. He wasn't especially
worried about a fix any time soon, but wondered if it was a symptom of
an underlying problem that needed to be addressed.
need to know
http://xrl.us/bk9n4
Regexp modifier to disable interpolation like m'' (#54702)
Ed Avis filed a feature enhancement request, to allow the "/n" flag on
a regular expression to indicate that no interpolation should be
performed.
Currently, only "m'300 $US'" (with single quotes as a pattern
delimiter) does no interpolation. Ed thought that "/300 $US/n" might
be clearer.
we'll get the whole alphabet in some day
http://xrl.us/bk9n6
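To make the difference concrete, here is a small sketch of the two
behaviours that exist today. The "/n" flag itself is only a request,
so it does not appear; the variable names and strings are invented.

```perl
use strict;
use warnings;

my $US = 'dollars';

# With ordinary delimiters, $US is interpolated before the pattern is
# compiled, so this matches the text "300 dollars".
my $interpolated = ('price: 300 dollars' =~ /300 $US/) ? 1 : 0;

# With single quotes as delimiters nothing is interpolated. The "@" here
# reaches the regex engine as a literal character, not as an array.
my $literal = ('mail user@example.org' =~ m'user@example') ? 1 : 0;

print "$interpolated $literal\n";   # prints "1 1"
```

Note that a "$" inside "m''" is still a regex metacharacter once the
pattern reaches the engine; single quotes switch off interpolation,
not the regex syntax.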
"PathTools-3.27" triggers a bug in Perl (#54728)
Jan Dubois isolated a problem in "File::Spec::Win32"'s "catfile"
function. The fix from the client side is to stringify a $1 passed as
a parameter (a variation on the "better to be paranoid than sorry"
theme), since "catfile" appears to clobber it with some other action
before getting around to using it. Ideally, "catfile" should stringify
its arguments itself, although Jan wondered if there was a more
general way of solving the problem.
match point
http://xrl.us/bk9n8
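The client-side workaround amounts to quoting $1 so that a copy is
passed instead of the live variable. The sketch below does not use
"File::Spec"; join_path is a hypothetical stand-in for any function
that runs a regex match of its own before reading its arguments.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a function that performs its own match
# before it looks at @_. Since @_ aliases the caller's variables, a bare
# $1 argument would only be read after this inner match has run.
sub join_path {
    my $has_drive = ('C:' =~ /^([A-Z]):/);   # inner match with a capture
    return join '/', @_;
}

my $safe;
if ('lib/File/Spec.pm' =~ m{/(\w+)\.pm$}) {
    # "$1" copies the captured text before the call is made.
    $safe = join_path('lib', "$1");
}
print "$safe\n";
```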
Perl5 Bug Summary
278 new + 1345 open = 1623 (+13 -43)
http://xrl.us/bk9oa
http://rt.perl.org/rt3/NoAuth/perl5/Overview.html
New Core Modules
"Thread::Semaphore"
Jerry D. Hedden released 2.08, which adds a few checks for
undefined parameters.
http://xrl.us/bk9oc
In Brief
Ricardo Signes wondered why "delete local $hash{elem}" didn't work
when "local $hash{elem}; delete $hash{elem}" did. After boggling
briefly over the syntax, Rafael Garcia-Suarez thought it wouldn't be
too hard to make it work.
http://xrl.us/bk9oe
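A quick sketch of the two-statement form that does work today; the
hash contents and the elem_state helper are invented for the example.

```perl
use strict;
use warnings;

our %hash = (elem => 'original', other => 'kept');

# Hypothetical helper to report whether the element is there right now.
sub elem_state { return exists($hash{elem}) ? 'present' : 'absent' }

my $inside;
{
    local $hash{elem};     # save the element; restored at scope exit
    delete $hash{elem};    # remove it for the duration of the block
    $inside = elem_state();
}
my $after = elem_state();

print "inside: $inside, after: $after ($hash{elem})\n";
```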
Ricardo Signes looked at the documentation in "perlobj" and corrected
errors and omissions in "DOES". He hinted that he would take the axe
to the documentation for "UNIVERSAL".
less is more
http://xrl.us/bk9og
Jerry D. Hedden corrected a typo in perlop.pod that H.Merijn Brand
estimated as being a difference of about 3 pixels, thus possibly
qualifying for the smallest patch ever.
http://xrl.us/bk9oi
He also silenced build warnings in universal.c.
http://xrl.us/bk9ok
Nicholas Clark discovered what he thought was a "usage error in XS
subs" with the ALIAS keyword. This reminded Robert May that he had
written about a similar problem with INTERFACE last year, and that the
message had gone nowhere.
http://xrl.us/bk9on
Florian Ragwitz also managed what was roughly a seven pixel change to
fix a documentation typo in "Attribute::Handlers".
http://xrl.us/bk9op
Artur Bergman handed over maintenance of "Attribute::Handlers" to
Rafael Garcia-Suarez.
http://xrl.us/bk9or
Moritz Lenz saw that "Memoize.pm" refers to the old title of "Higher Order Perl"
and changed the wording. There was some discussion as to whether the full
text of HOP was available on the web, and if so, where?
http://xrl.us/bk9ot
After Steve Peters performed an upgrade to "AutoLoader" to bring it to
5.66, Nicholas Clark bumped it up to 5.66_01 to be on the safe side.
for the record
http://xrl.us/bk9ov
Craig Berry returned to the "File::Copy" & permission bits issue,
saying that changes were unlikely to fly on VMS. Aristotle Pagaltzis
pointed out that on Windows, files tend to inherit their permission
bits from the directory in which they reside, and that the only
important bit to honour on Unix systems is the execute bit.
http://xrl.us/bk9ox
Renée Bäcker was Warnocked over a patch to add more documentation to
attributes.pm.
http://xrl.us/bk9oz
About this summary
This summary was written by David Landgren.
Weekly summaries are published on http://use.perl.org/ and posted on a
mailing list (subscription: perl5-summary-subscribe@perl.org). The
archive is at http://dev.perl.org/perl5/list-summaries/. Corrections
and comments are welcome.
If you found this summary useful, please consider contributing to the
Perl Foundation or attending a YAPC to help support the development of
Perl.
--
stubborn tiny lights vs. clustering darkness forever ok?