This Week on perl5-porters - 18-24 May 2008

David Landgren
May 30, 2008 14:23

   "Ah, more details about filenames. Well, this sounds positively weird.
   Octet strings are not particularly user-friendly if you can't
   interpret them as characters reliably.

   From what you say, and what I think I've heard elsewhere, Unix
   filename interpretation is a mess. Seems like the only bigger mess
   I've heard about is VMS file handling, where they seem to have a
   choice of several messes." -- Glenn Linderman, deep in the heart of
   Unicode, case conversion, filenames, encodings, character sets, ß and
   other exciting issues.

Topics of Interest

Another perldoc shortcut

   Tom Christiansen commented on Gisle Aas's perldoc shortcut (that
   "perldoc ipc" would redirect to "perlipc", assuming no ipc.{pod,pm}
   existed), saying that in pre-5.8 times he had been working on a
   technique to make "perlipc" itself, run from the command-line, do the
   same thing. Somewhere along the line, things went astray and the work
   never made it into the core.

     not bitter, not really

"File::Path::mkpath()" incompatibility in perl-5.10

   I had expected to make some progress on this issue, this week, but
   Real Life is eating my tuits like popcorn at the moment.

     next week, cross my heart

On the almost impossibility to write correct XS modules

   I might preface this thread "on the almost impossibility to write a
   correct summary of a complex subject". Marc Lehmann had written a few
   weeks ago that a bare "char *" through an XS API is fraught with
   peril, because there is no metadata available to tell you if it's
   Latin-1, KOI8-R, UTF-8 or something else.

   The thread blossomed this week, with a long-running debate about what
   is broken (and when, and how). One point that was made is that Win32
   encodes filenames in a particular way that doesn't really jibe with
   the rest of the internals. Unfortunately, it is only with hindsight
   that the problem really became apparent, hence the dilemma is that
   fixing it would break everything that has tried, with various degrees
   of success, to work around it.

   The "utf8" flag on SVs was again singled out as being responsible for
   world hunger and other assorted ills, with a number of examples
   demonstrating the problems.

   Rafael Garcia-Suarez outlined an approach that just may be a way
   forward out of the mess. After listening to Juerd Waalboer, he thought
   that marking an SV as "binary" and thereby disqualified from being
   upgraded to Unicode would be quite useful.

   Glenn Linderman invented "blorf" as an opaque token for discussing the
   issues without people getting sidetracked over definitions of bytes,
   strings, characters, numbers and codepoints.

     hard core

It's wafer thin!

   David Nicol's tiny patch to document the empty pattern ("m//") more
   clearly sparked a fairly intense technical debate over how to get rid
   of the latter.
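
   For readers unfamiliar with the behaviour at issue: perlop documents
   that a pattern which evaluates to the empty string reuses the last
   successfully matched pattern. A minimal sketch, with purely
   illustrative variable names:

```perl
use strict;
use warnings;

# A successful match makes /b/ the "last successfully matched pattern".
my $first = ('abc' =~ /b/) ? 1 : 0;

# The empty pattern does not mean "match the empty string" here:
# it reuses /b/, so a string without a "b" fails to match.
my $no_b  = ('xyz' =~ m//) ? 1 : 0;
my $has_b = ('bob' =~ m//) ? 1 : 0;

print "first=$first no_b=$no_b has_b=$has_b\n";
```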

   One point of particular interest was when Aristotle Pagaltzis
   suggested a "s///R" modifier which would return a modified copy of the
   original string, instead of modifying the contents and returning the
   number of matches made.

   As it turns out, this would solve a number of problems very nicely,
   not the least being the elegantly succinct

     my @changed = map { s/$this/$that/R } @list;

     so let's have it already
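
   For comparison, a sketch of what the same task looks like without
   such a modifier: each element is copied by hand and the copy is
   returned explicitly. The sample values of $this, $that and @list are
   assumptions made for illustration.

```perl
use strict;
use warnings;

my ($this, $that) = ('cat', 'dog');
my @list = ('cat food', 'catalog', 'bird');

# Copy each element, substitute in the copy, and return the copy,
# leaving the original list untouched.
my @changed = map { (my $copy = $_) =~ s/$this/$that/; $copy } @list;

print join(', ', @changed), "\n";
```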

Compiling 5.10 with g++ 4.3.0

   Not content with compiling perl with old gcc compilers, Bram took a
   very new one for a spin to see how things worked out.

   It did of course go *boom* (otherwise you probably wouldn't be reading
   about it). Bram traced the problems down to typedefs and enums in
   system headers, and wondered how this could be sorted out in
   Configure.

     duty now for the future

"Getopt::Long", + options, installperl and +v

   Nicholas Clark was looking at how to factor out the common code in
   "installman" and "installperl" and noticed that the main sticking
   point regarding "installperl" was that it admitted a "+v" switch
   (which does something different from "-v"), using hand-rolled @ARGV
   processing.

   This precludes it from using "Getopt::Long" because, while
   "Getopt::Long" can be taught to accept "-x" and "+x", it offers no way
   of discriminating between the two.

   Johan Vromans said that as it turns out, with a bit of hand-holding,
   it is possible to coax the information out as things stand, and he
   plans to improve support for - and + switches in a future release.

   Nicholas thought that a middle path might be to keep the hand-rolled
   code, but adjust it to dump its results into an %opts hash, which
   would allow a drop-in replacement when "Getopt::Long" gets updated
   with the needed functionality.
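
   A rough sketch of that middle path, not the actual "installperl"
   code: a hand-rolled loop that records each switch in a hash, keyed by
   the full switch text so that "-v" and "+v" stay distinct. The helper
   name parse_switches and the sample arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: peels leading single-letter -x / +x switches off
# an argument list and records them in a hash.
sub parse_switches {
    my @args = @_;
    my %opts;
    while (@args && $args[0] =~ /^([-+])(\w)$/) {
        $opts{"$1$2"} = 1;    # key is "-v" or "+v", never just "v"
        shift @args;
    }
    return (\%opts, \@args);
}

my ($opts, $rest) = parse_switches('+v', '-n', 'destdir');
print join(' ', sort keys %$opts), " | @$rest\n";
```

   Because callers only ever look at the hash, the loop could later be
   swapped for a module-based parser without touching the rest of the
   program.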

   This brought forth a long discourse from Tom Christiansen, who
   admitted to the wrong kind of laziness regarding command-line switches
   by resorting to hand-rolling code to deal with a solitary switch when
   in fact it would have been better to rely on a module. When he quizzed
   Larry Wall about it during the first decade of Perl's development,
   Larry admitted to rolling his own frequently, since it seemed a bit of
   a waste in his eyes to pull in a module for just one or two
   switches for a program little more than a one-liner. As a peace
   offering for his own hand-rolling sins, Tom offered the list the
   ultimate file renaming Perl program.

     bespoke options

On broken manpages, trolling, inconsistent implementation and the 
difficulty to fix bugs

   Marc Lehmann wrote a long response to Jan Dubois as a spin-off from
   the "On the impossibility of writing XS correctly" thread, stating
   that Perl's Unicode handling is broken because some parts of the core
   deal with
   Unicode one way, and other parts another way. This leads to annoying
   bugs, in that they are hard to identify, and hard to fix.

   Tom Christiansen called him out for excessive use of rhetoric and
   asked him to clarify a couple of points. Several messages later Yves
   Orton offered a nice summary of the situation that showed where things
   break down. Then people started to speak about encodings, bytes,
   characters and character sets and as usual my eyes began to acquire
   that dead fish look.

     see also

On the problem of strings and binary data in Perl

   On the subject of subjects on the problem of things, Yves Orton broke
   out into a new thread to discuss the schizophrenic attitude that Perl
   has when dealing with strings. He put forward a proposal for
   identifying and processing Unicode strings and asked people to point
   out where he was wrong. Rafael Garcia-Suarez made a decent effort at
   doing just that.

   Juerd Waalboer provided a contrarian argument, suggesting that Unicode
   works pretty well in Perl, insofar as one can have strings containing
   Unicode, and other strings containing binary data, because in a
   correct program, one usually doesn't have the two appearing in the
   same string. (such as having the Thai-encoded name of a Thai person
   concatenated with the slurped contents of a PNG file representing his
   signature in the same Perl scalar). In Juerd's eyes, the main problems
   come about when dealing with pure binary data and hoping that it
   doesn't wind up being treated as Unicode when it shouldn't.

     more recommended reading
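
   A small sketch of the separation Juerd describes, using the core
   "Encode" module: text is decoded on the way in and encoded on the way
   out, while binary data is left as octets throughout. The sample
   values are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Three octets as they might arrive from a file or socket: the UTF-8
# encoding of U+0E01 (THAI CHARACTER KO KAI).
my $octets = "\xE0\xB8\x81";

# Decoding turns the octets into a one-character text string.
my $text = decode('UTF-8', $octets);

# Binary data (here, the eight-octet PNG signature) is never decoded.
my $binary = "\x89PNG\x0D\x0A\x1A\x0A";

# Text is encoded explicitly again before it leaves the program.
my $round_trip = encode('UTF-8', $text);

printf "octets=%d text=%d binary=%d\n",
    length($octets), length($text), length($binary);
```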

   As a followup to the above discussion, Juerd announced that he had
   released BLOB to CPAN.

"" alias for "%+"

   Amir Elisha Aharoni ventured for the first time into the waters of
   p5p, suggesting that %NAMED_CAPTURE would be a nice English name for
   the new 5.10 "%+" variable. Yves Orton thought the idea was worthy of
   consideration, but one also needed to deal with "%-" at the same time,
   which could be named %NAMED_CAPTURE_LIST.

     updating the babelfish
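
   For reference, a short sketch of the two punctuation variables under
   discussion, as they behave in 5.10. The date string and the capture
   names are invented for the example.

```perl
use strict;
use warnings;
use 5.010;

my ($year, $first_part, $second_part);
if ('2008-05-30' =~ /(?<year>\d{4})-(?<part>\d\d)-(?<part>\d\d)/) {
    # %+ gives one value per capture name.
    $year = $+{year};

    # %- gives, per name, an array ref covering every group of that name.
    ($first_part, $second_part) = @{ $-{part} };
}

print "$year $first_part $second_part\n";
```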

07arith.t failing on "_strptime('2001-2-29 12:34:56','%Y-%m-%d %H:%M:%S')"

   2001 was not a leap year, so trying to parse February 29, 2001 is an
   error. Apparently there is a test in "Time::Piece" to ensure it fails
   in the correct manner. Unfortunately, on some of the more exotic
   platforms like VMS and OS/X, the call also correctly fails, but does
   so in a way that fools the test suite.
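
   As a reminder of why that date is invalid, here is the Gregorian leap
   year rule as a tiny sketch. The helper is_leap_year is hypothetical
   and is not part of "Time::Piece".

```perl
use strict;
use warnings;

# Hypothetical helper: a year is a leap year if it is divisible by 4,
# except for century years, which must also be divisible by 400.
sub is_leap_year {
    my ($year) = @_;
    return 1 if $year % 400 == 0;
    return 0 if $year % 100 == 0;
    return $year % 4 == 0 ? 1 : 0;
}

for my $year (1900, 2000, 2001, 2008) {
    printf "%d: %s\n", $year, is_leap_year($year) ? 'leap' : 'not leap';
}
```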

     at the third stroke it will be the 32nd of february

   Gisle Aas gave some additional background regarding Time-Piece-1.13
   test failures on HP-UX, by forwarding a message he sent to Matt
   Sergeant, the author of "Time::Piece".

Some smoke digging (HP-UX failures)

   H.Merijn Brand delved into HP-UX smoke reports to figure out what was
   going wrong. "Time::Piece" was already under control (see above), but
   "Math::Trig" was failing (and the only recent change has been an
   upgrade to "Math::Complex"). Tests for "readdir" were also turning
   black, which suggested subtler problems.

   Halfway through the conversation, Craig Berry announced the
   integration of Gisle Aas's fix for "Time::Piece" which addressed the
   VMS problems, and H.Merijn reported that it did the trick for HP-UX as
   well. Using the power of CPAN, H.Merijn was able to go through
   previous "Math::Complex" versions, and this allowed him to resolve
   that problem.

   I think the "readdir" problem was solved by upgrading the smoke
   harness.

   The remaining failure appeared to be caused by "use blib" hoisting in
   an errant directory into @INC. Bram showed him how to fix that, which
   should nail down the last error.

     going for O O O O

   But then H.Merijn reported a problem with a failing blib test, and
   everyone pretended to pay attention to the character encoding debates.

     war knocked

TODO of the week

Improve the coverage of the core tests

   Use "Devel::Cover" to ascertain the core modules' test coverage, then
   add tests that are currently missing.

   Just to help budding testers along, here is a non-exhaustive list of
   suggestions to get you going (suggested by sorting out the biggest
   ".pm" files in lib/):


   Even concentrating on a single module would be helpful.

Patches of Interest

"ExtUtils::ParseXS" - Error reporting problem with INTERFACE and ALIAS 

   About a year ago, Ken Williams explained that, while he was the
   maintainer of this module, he didn't know what was the best way to
   address the problem that Robert May had brought up regarding error


   Of the two approaches supplied by Robert as a solution, Ken liked the
   second one back then, and Nicholas Clark, reviving the conversation,
   agreed that it seemed to make more sense.

   He had a look at how things work currently, and realised that with a
   new function, he could effect a small saving of space. As a result,
   both the core and "EU::PXS" could rely on the function.

   Nicholas wrote the function, and felt that it would make it into 5.8.9
   and 5.10.1. For older releases, "ExtUtils::ParseXS" would need to
   bundle the function, and emit it as required if the core didn't supply
   it.

   Rob thought that this sounded reasonable, except that if ever a bug is
   found in the function that Nicholas just wrote, it would need to be
   fixed both in the core and EU::PXS. Since this would be less than
   desirable, Robert said that he would try to come up with an alternate
   patch at some point.


"" should not warn about loading ".par" files

   Paul Fenwick noted that a "use lib 'Foo.par'" will issue a warning,
   but load the damned thing anyway. Since someone pulling in a library
   in this way probably has a pretty good idea what they're doing anyway,
   Paul thought it would be a good idea to suppress the warning, just for
   ".par" files.

   Rafael Garcia-Suarez felt that this made sense, so he applied the
   patch. Steffen Müller wanted to know if this meant that would
   be dual-lifed, so that 5.8.8 could benefit from the improvements.

     dual-life pragma on par

Indented preprocessor directives in sv.c

   Jerry D. Hedden noticed that some preprocessor directives in sv.c were
   not flush left, and thought that some compilers would choke on it.
   H.Merijn Brand explained that it was perfectly legal according to
   ANSI, although he admitted that some older compilers, such as on AIX,
   would likely get into trouble over this.

   Both Robin May and Andy Dougherty explained that something that does
   work is to leave the # in the first column, and then indent the macro
   preprocessor directive as appropriate.

     hash hard left

New and old bugs from RT

*x{IO} bizarre copying (#3314)

   Steve Peters discovered that some bizarre code that used to emit a
   bizarre error message now emits a more prosaic error message. He
   noticed that the change occurred way back in change #27179 and asked
   if anyone had objections to backporting it to 5.8.

     a leap into the unknown

"exists()": error message on wrong argument type is incorrect (#38955)

   A couple of years ago, Jeremy Hetzler noted that "exists" may be
   applied to a HASH, an ARRAY and also a subroutine name. The
   documentation even admits as much.
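
   A quick sketch of the three accepted forms. The names used here are
   invented for the example.

```perl
use strict;
use warnings;

my %hash  = (key => undef);
my @array = (1, 2, 3);
sub defined_sub { return 1 }

# A hash element exists even when its value is undef.
my $hash_elem  = exists($hash{key})   ? 1 : 0;

# Index 5 lies beyond the end of the three-element array.
my $array_elem = exists($array[5])    ? 1 : 0;

# With a leading &, exists asks whether a subroutine has been declared;
# it does not call it.
my $sub_known  = exists(&defined_sub) ? 1 : 0;
my $sub_absent = exists(&no_such_sub) ? 1 : 0;

print "$hash_elem $array_elem $sub_known $sub_absent\n";
```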

   On the other hand, for incorrect use, such as applying it to a scalar,
   the error message makes mention of only HASH and ARRAY, not of
   subroutines.

   Bram patched the source to bring the error message into line with the
   documentation and implementation, and Rafael Garcia-Suarez applied it.

     language lawyers rejoice

No complaint about bareword (#53806)

   Rafael Garcia-Suarez supplied a fix for the "print Does::Not::Exist,
   ''" problem, so that the bareword is correctly identified as such, and
   not stringified. Despite all the magic surrounding "print"'s first
   argument, all that Rafael needed to do was to hoist a goto label four
   lines higher in the source.

   H.Merijn Brand applied the correction, along with Bram's tests.

"pod2man" loses =head2 starting ' or . (#53910)

   Bram correctly identified "Pod::Man" as a dual-life module. This means
   that the best place to fix this particular problem is in the CPAN
   distribution, which can then be synched with blead when the problem is
   fixed.


"IO::Seekable" + "POSIX" = constant subroutines redefined (#54186)

   Part of the fallout from Nicholas Clark's corrections for this bug is
   that calls with the wrong number of arguments cause the program to
   croak. Rafael Garcia-Suarez felt it was safe enough to inflict on the
   world. As a point of confirmation, Sébastien Aperghis-Tramoni ran a
   code search and didn't find any examples of such usage.

     safe to break

"perlipc" problems

   Andrew at Sundale noted a problem in the documentation in "perlipc"
   concerning the signalling of negative process IDs. Steve Peters
   tweaked the example to show more clearly what was happening.

     perlipc and negative pids (#54412)

   Andrew found another problem with "setsid", in that the
   documentation suggests a "setsid or die" idiom, except that, if one
   reads the manpage for "setsid", one learns that it returns -1 on error
   (as do many other system calls). As such, if the "setsid" call fails,
   the die won't be triggered.

     perlipc and negative truth (#54422)
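
   The concern can be illustrated with a stand-in function. This sketch
   makes no claim about what Perl's own "setsid" wrapper hands back to
   the caller; fails_with_minus_one is hypothetical and merely mimics a
   call that signals failure with -1.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a call that reports failure by returning -1.
sub fails_with_minus_one { return -1 }

my $died = 0;

# -1 is a true value in Perl, so "or" never reaches its right-hand side.
fails_with_minus_one() or $died = 1;

# Comparing against -1 explicitly does notice the failure.
my $noticed = (fails_with_minus_one() == -1) ? 1 : 0;

print "died=$died noticed=$noticed\n";
```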

   While we're on the subject, Andrew found one final problem concerning
   the documentation for safe pipe opens.

     perlipc unclear on the concept (#54424)

Faulty "select()" in Activestate perl (#54544)

   Marc Lehmann noted that "select" returns "Unknown Error (10022)"
   instead of simply timing out.

     just no it

Assertion failure fiddling with @ISA (#54566)

   Niko Tyni discovered a way of abusing @ISA that would result in an
   assertion failure. Rafael Garcia-Suarez figured out what was going
   wrong in mg.c and provided a patch that H.Merijn Brand applied.

     out through the smtp tunnel

"Can't take log of 0" error in perl 5.8.8. 64 bit (#54590)

   Lourdes Peña Castillo reported that on some versions of perl, but not
   others, the number 2.5e-310 gets rounded down to 0, and the log of 0
   is negative infinity.

   Various porters reported similar behaviour on a variety of perls,
   platforms and Configure options, but no clear reasons why.

     now you see it, now you don't
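
   Until the cause is understood, code that may meet such tiny values
   can guard the call. A defensive sketch, in which safe_log is a
   hypothetical helper returning undef instead of dying:

```perl
use strict;
use warnings;

# Hypothetical helper: returns undef for zero or negative input rather
# than letting log() die.
sub safe_log {
    my ($x) = @_;
    return $x > 0 ? log($x) : undef;
}

# Whether this literal survives as a non-zero number is exactly what
# varied between the perls in the report, so no output is assumed here.
my $tiny   = 2.5e-310;
my $result = safe_log($tiny);

print defined $result ? "log is $result\n" : "value was rounded to 0\n";
```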

"PerlIO::via" free unrefed scalar on certain dodgy code (#54686)

   Kevin Ryde wrote some slightly broken code that managed to make the
   perl interpreter complain about memory problems. He wasn't especially
   worried about a fix any time soon, but wondered if it was a symptom of
   an underlying problem that needed to be addressed.

     need to know

Regexp modifier to disable interpolation like m'' (#54702)

   Ed Avis filed a feature enhancement request, to allow the "/n" flag on
   a regular expression to indicate that no interpolation should be
   performed.

   Currently, only "m'300 $US'" (with single quotes as a pattern
   delimiter) does no interpolation. Ed thought that "/300 $US/n" might
   be clearer.

     we'll get the whole alphabet in some day
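
   A sketch of the existing behaviour that the request builds on. The
   variable $US and its value are assumptions made for the example.

```perl
use strict;
use warnings;

my $US    = 'dollars';
my $price = '300 dollars';

# With ordinary delimiters, $US is interpolated before matching,
# so the pattern becomes /300 dollars/.
my $interpolated = ($price =~ /300 $US/) ? 1 : 0;

# With single quotes as delimiters, $US is not replaced by the
# variable's value, so this pattern does not match '300 dollars'.
my $uninterpolated = ($price =~ m'300 $US') ? 1 : 0;

print "interpolated=$interpolated uninterpolated=$uninterpolated\n";
```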

"PathTools-3.27" triggers a bug in Perl (#54728)

   Jan Dubois isolated a problem in "File::Spec::Win32"'s "catfile"
   function. The fix from the client side is to stringify a $1 passed as
   a parameter (a variation on the "better to be paranoid than sorry"
   theme), since "catfile" appears to clobber it with some other action
   before getting around to using it. Ideally, "catfile" should stringify
   its arguments itself, although Jan wondered if there was a more
   general way of solving the problem.

     match point
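
   The client-side workaround can be sketched as follows. The function
   match_then_read is a hypothetical stand-in for any routine that runs
   a regex of its own before reading its arguments; it is not the
   "catfile" source.

```perl
use strict;
use warnings;

# Hypothetical stand-in: performs its own match, then reads its argument.
sub match_then_read {
    'xyz' =~ /(y)/;
    return $_[0];
}

# Passing $1 directly hands the callee an alias to the capture variable,
# so the value it reads may no longer be the caller's capture.
'abc' =~ /(b)/;
my $aliased = match_then_read($1);

# Stringifying first passes a plain copy that later matches cannot touch.
'abc' =~ /(b)/;
my $copied = match_then_read("$1");

print "aliased=$aliased copied=$copied\n";
```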

Perl5 Bug Summary

     278 new + 1345 open = 1623 (+13 -43)

New Core Modules

       Jerry D. Hedden released 2.08, which adds a few checks for
       undefined parameters.

In Brief

   Ricardo Signes wondered why "delete local $hash{elem}" didn't work
   when "local $hash{elem}; delete $hash{elem}" did. After boggling
   briefly over the syntax, Rafael Garcia-Suarez thought it wouldn't be
   too hard to make it work.
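
   For reference, a sketch of the two-statement form that already works.
   The hash contents are invented for the example.

```perl
use strict;
use warnings;

my %hash = (elem => 'original', other => 'untouched');

my $seen_inside;
{
    # Localise the element, then delete it for the rest of this block.
    local $hash{elem};
    delete $hash{elem};
    $seen_inside = exists($hash{elem}) ? 1 : 0;
}

# Leaving the block restores the saved element.
my $seen_after = exists($hash{elem}) ? $hash{elem} : undef;

print "inside=$seen_inside after=$seen_after\n";
```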

   Ricardo Signes looked at the documentation in "perlobj" and corrected
   errors and omissions in "DOES". He hinted that he would take the axe
   to the documentation for "UNIVERSAL".

     less is more

   Jerry D. Hedden corrected a typo in perlop.pod that H.Merijn Brand
   estimated as being a difference of about 3 pixels, thus possibly
   qualifying for the smallest patch ever.

   He also silenced build warnings in universal.c.

   Nicholas Clark discovered what he thought was a "usage error in XS
   subs" with the ALIAS keyword. This reminded Robert May that he had
   written about a similar problem with INTERFACE last year, and that the
   message had gone nowhere.

   Florian Ragwitz also managed what was roughly a seven pixel change to
   fix a documentation typo in "Attribute::Handlers".

   Artur Bergman handed over maintenance of "Attribute::Handlers" to
   Rafael Garcia-Suarez.

   Moritz Lenz saw that "" refers to the old title of "Higher Order
   Perl" and changed the wording. There was some discussion as to whether
   the full text of HOP was available on the web and, if so, where.

   After Steve Peters performed an upgrade to "AutoLoader" to bring it to
   5.66, Nicholas Clark bumped it up to 5.66_01 to be on the safe side.

     for the record

   Craig Berry returned to the "File::Copy" & permission bits issue,
   saying that changes were unlikely to fly on VMS. Aristotle Pagaltzis
   pointed out that on Windows, files tend to inherit their permission
   bits from the directory in which they reside, and that the only
   important bit to honour on Unix systems is the execute bit.

   Renée Bäcker was Warnocked over a patch to add more documentation to

About this summary

   This summary was written by David Landgren.

   Weekly summaries are published on the web and posted to a mailing
   list, where they are also archived. Corrections and comments are
   welcome.

   If you found this summary useful, please consider contributing to the
   Perl Foundation or attending a YAPC to help support the development of
   Perl.

stubborn tiny lights vs. clustering darkness forever ok?
