NWCLARK TPF grant May report

From: Nicholas Clark
Date: July 26, 2013 14:51
Subject: NWCLARK TPF grant May report
Message ID: 20130726145057.GO4940@plum.flirble.org
As per my grant conditions, here is a report for the May period.

This month I simplified part of the implementation of pack and unpack,
removing about 130 lines of code, and reducing the object code size by about
2K. The only casualty was support for mixed-endian systems. Sorry, PDP-11
users.

In the medal stakes for "risk to sanity", the implementation of pack and
unpack are strong contestants, although *only* for bronze. (They're fighting
it out with the implementation of sprintf. Compared with sprintf, they have
the advantage that they've been seen as sufficiently dangerous to nearby
code that they've been quarantined into their own file. The regular
expression engine gets silver, and the reigning champion and gold medal
holder remains the parser itself.)

pack/unpack is one of those pieces of code which has grown organically as
features were added, and so has become too large to easily comprehend the
big picture. In particular, it's hard to look at it at more than one level
at once, and hence see where it could be profitably refactored.

The main code of pp_pack and pp_unpack is a pair of large switch
statements, which are used to dispatch on a single template code, and
perform the correctly sized and typed read or write of bytes to or from
the string. There seems to be a
lot of repetition in there, but it's more at a "template" level than a code
level, because the types of the variables in each case differ. Pretty much
all the things that are the same size and could be merged have been
merged. There's no obvious way to squeeze it further with this design.
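To give a flavour of that repetition, here is a simplified sketch of the
pattern (illustrative only - the real cases in pp_pack.c also deal with
UTF-8 input, byte-order modifiers and checksums):

    switch (datumtype) {
    case 's': {                           /* signed 16 bit integer */
        I16 ai16;
        /* memcpy() copes with unaligned input */
        memcpy(&ai16, s, sizeof(ai16));
        s += sizeof(ai16);
        sv_setiv(sv, (IV)ai16);
        break;
    }
    case 'l': {                           /* signed 32 bit integer */
        I32 ai32;
        memcpy(&ai32, s, sizeof(ai32));
        s += sizeof(ai32);
        sv_setiv(sv, (IV)ai32);
        break;
    }
    /* ... and so on, for each size and signedness ... */
    }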

However, one thing did strike me as odd - a pattern of repeated macros in
the unpack code, particularly in all the cases that dealt with the various
sizes and signedness of integers. They exist due to the order that the code
was written and then enhanced. The original code (dating from Perl 3 era)
read numbers as chunks of memory. It copied bytes to/from the (possibly
unaligned) input string to a local variable of the correct type. Much much
later Marcus Holland-Moritz added code to provide the pack modifiers "<" and
">" (for endianness) and implemented the endian swapping they require, by
endian swapping the values in the integer variables. Subsequently Ton Hospel
added code to cope with the unpack input string having been internally
transcoded to UTF-8, by having the copy transcode back if needed. These two
later additions are respectively the second and first macros.

So there was one macro to read a number into a variable, and then a second
macro to (optionally) spin the bits to re-order for a different endianness.
This looked more complex than it might be. On digging further I realised
that it was more subtle than it first looked - the code didn't just support
the obvious big and little endian byte orders, but also mixed endian.
(Technically, actually, I think it could support any arbitrary endianness.)
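Schematically, each integer case was a two-step dance (the macro names
here are from memory and should be treated as illustrative):

    U32 au32;
    /* Step 1: copy bytes from the (possibly unaligned, possibly
       UTF-8 transcoded) input into a correctly typed variable */
    SHIFT_VAR(utf8, s, strend, au32, datumtype);
    /* Step 2: optionally reverse the bytes, to honour the '<' and
       '>' endianness modifiers */
    DO_BO_UNPACK(au32, 32);

The subtlety is in what "reverse" has to mean when more than two byte
orders are possible.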

Big endian and little endian are familiar - on a big endian system the 32
bit number 0x04030201 would be stored in memory as the bytes 0x04, 0x03,
0x02, 0x01; on a little endian system as 0x01, 0x02, 0x03, 0x04.
Mixed endian is quirky - 0x03, 0x04, 0x01, 0x02.
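If you're curious which your machine is, a minimal standalone C program
(nothing Perl-specific here) will show you:

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void) {
        uint32_t n = 0x04030201;
        unsigned char b[sizeof n];
        memcpy(b, &n, sizeof n);
        /* prints "04 03 02 01" on big endian, "01 02 03 04" on
           little endian, "03 04 01 02" on PDP-11 style mixed endian */
        printf("%02x %02x %02x %02x\n", b[0], b[1], b[2], b[3]);
        return 0;
    }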

We haven't had many bug reports from PDP-11 users since, hmmm, actually
since 5.000 was shipped, which made me wonder just how much of the code
still worked, and if anyone at all would notice*.

So I went digging. Mixed-endian support was first added in Perl 3, as
part of adding 'N' and 'n' templates to pack. Fallback functions were
provided in util.c in case the system didn't define htonl() and ntohl().

The fallback functions contain conditionally compiled special-case code for
little endian systems, and a fallback loop for other values of BYTEORDER.
The intent is that the fallback loop will work on any byteorder. In
fact, it only works correctly on 32 bit little endian systems, because it
*always* swaps the order of the bytes within the word. So the irony is that the
special-purpose little endian code is actually correct for all platforms,
and the intended-to-be generic code only works on little endian.
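A reconstruction of the shape of the bug (a sketch, not the verbatim
code from util.c): the loop unconditionally reverses the bytes of the
word, and reversal only happens to be the correct network-to-host
transformation when the host is little endian.

    unsigned long
    my_ntohl(unsigned long l)
    {
        union {
            unsigned long l;
            unsigned char c[sizeof(unsigned long)];
        } in, out;
        size_t i;
        in.l = l;
        for (i = 0; i < sizeof(unsigned long); i++)
            out.c[i] = in.c[sizeof(unsigned long) - 1 - i]; /* always swaps */
        return out.l;  /* wrong on big endian, where ntohl() is a no-op */
    }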

Hence I infer that they were never needed and so never used, because
if they had been, regression tests would have failed. Moreover, it seems
that no-one ever even needed to link with them, because as is noted in "perl
4.0 patch 19" (commit 988174c19bcf26f6, Nov 1991):

+/*
+ * I think my_swap(), htonl() and ntohl() have never been used.
+ * perl.h contains last-chance references to my_swap(), my_htonl()
+ * and my_ntohl().  I presume these are the intended functions;
+ * but htonl() and ntohl() have the wrong names.  There are no
+ * functions my_htonl() and my_ntohl() defined anywhere.
+ * -DWS
+ */

(DWS appears to be "David W. Sanderson", who provided code for 'V' and 'v')


Because I wasn't confident about my code inspection, and it all looked too
unlikely to be actually true, I tested some of this on an emulator. I don't
know how much a PDP-11 cost in its day, but it's amusing that you can
emulate it on a Raspberry Pi. (Which is a lot smaller, uses considerably less
power, and costs $25).

The nice thing about Unix is that an ancient BSD is familiar enough to be
workable. The not so nice thing is that the C compiler misses a few things -
it only understands K&R argument style, doesn't do function prototypes, and
it's missing some *printf formats such as "%lX" and "%hx". (Which matter,
because int is 16 bits, and so there are a whole lot more size errors you
can't get away with for varargs functions such as printf.) There's no way
that that toolchain is going to build a perl from this decade, and I'm not
even sure if a viable C89 toolchain exists. So I didn't spend any more time
looking at it. (And it wouldn't have been trivial, as my "connectivity" with
the machine was typing and copy/paste on the logged in console. Getting
emulated networking set up would require root on the host machine figuring
out how to set up a bunch of IP-level games. This is a lot more complex than
running apt-get to install the emulator and then downloading the relevant
disk image for NetBSD.)

I made the executive decision that all the mixed endian support code can go.
Patches welcome to restore it, but we're only really interested if you have
a genuine reason to keep supporting PDP-11s going forward, and are in a
position to run a smoker.

With it gone, the simplification becomes tractable. Endianness is now either
"big endian" or "little endian", and the problem of "which endian?" reduces
to either "no-op" or "swap all the bytes". At which point, it's possible to
implement the endian-swap as part of the copy/transcode fixup, because the
number of bytes wanted is fixed and known, and the transformation is always
"reverse". So if endian swapping is needed, simply process those bytes from
the input in reverse order. The pair of macros (actually, an entire brace of
macros) simplify massively, to an inline call to memcpy() if it's all
fine, and a call to a fixup function if it is not. This removes a lot of
inline code related to endian swapping, and results in the 2K object code
shrinkage. The common use cases remain inline, so the code shouldn't be any
slower (although I haven't tried to measure this).
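The shape of the new fixup path is roughly this (an assumed sketch, not
the exact code that landed): with only "big" and "little" left, the swap
is always "copy the fixed-size field backwards", so it folds into the
copy itself.

    static void
    fetch_bytes(char *dest, const char *src, size_t len, int do_swap)
    {
        if (!do_swap)
            memcpy(dest, src, len);   /* the common, fast case */
        else
            while (len--)
                *dest++ = src[len];   /* consume the input in reverse */
    }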


A bug that consumed a lot of time this month was RT #118055. For v5.18.0
Father Chrysostomos significantly re-wrote the OP generation code to use a
slab allocation scheme, which solves many previous problems with memory
leaks when syntax errors cause compilation to abort. (The "blame" here seems
to be one of those subtle problems of using code in ways it wasn't
originally intended. yacc doesn't complicate its design by making efforts to
clean up on compilation failure, because it assumes that compilation failure
will mean process exit, and everything gets freed anyway. This holds true
for the main Perl program too - syntax errors abort. But Perl 5 gained the
ability to be embedded (where total cleanup matters), and is now often used
as a persistent process (where even the occasional failed eval will start to
add up), so the cost of the problem has slowly been growing.)

However, there was an unfortunate bug in the slab allocator that none of us spotted.
We were actually being naughty with alignment - the slab allocator assumes
that everything in the OP structure only needs alignment with the platform's
pointer size, but because PMOP was using an IV when it should have been using a
pointer-sized integer this didn't always hold true.
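The mismatch looks something like this (illustrative declarations, not
the real PMOP):

    /* On a 32 bit platform built with -Duse64bitint: */
    struct op_like {
        void      *op_next;      /* pointers need only 4 byte alignment */
        long long  op_pmoffset;  /* but a 64 bit IV wants 8 byte
                                    alignment on sparc32 */
    };
    /* The slab allocator aligns to sizeof(void *), i.e. 4, so an OP
       can start at an address that is 4 mod 8, and then any access
       to the 64 bit member dies with SIGBUS. */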

The problem was only relevant on 32 bit platforms when optionally compiling
with 64 bit IVs and ithreads (neither of which is the default), and the only
platform we're aware of that enforces the alignment is sparc32.
Frustratingly it was only spotted (by the Debian Perl team) just after
v5.18.0 was released, else it would have been a release blocker.

Ideally we'd fix the header to change the structure to use the correct type,
but that's obviously not viable for a maintenance fix, as it's not binary
compatible. Having chewed over various ways to work around the problem, all
of which felt complex, fragile or both and had the potential to be slower, I
spotted a different approach which was simple and could be constrained to
just the class of affected builds.

OP allocation can't always come from slabs - for various reasons OPs
sometimes need to be allocated when no slab is available. Hence each OP
already had a flag
saying whether it was from a slab, or had been allocated directly using
malloc(). malloc() will, by design, always return suitably aligned memory.
Hence the fix is for the affected configurations to conditionally compile in
code that intercepts allocations for PMOP, and always allocates them with
malloc(). It's 2 lines of C code, 2 lines of C pre-processor, and will be in
v5.18.1.
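In outline it looks something like this (the condition and the allocator
call are my assumptions, not the verbatim patch):

    /* In the OP allocator, for affected builds only: PMOPs bypass
       the slab, because malloc()/calloc() memory is guaranteed to be
       suitably aligned for any type. */
    #if defined(USE_ITHREADS) && IVSIZE > PTRSIZE
        if (sz == sizeof(PMOP))
            return calloc(1, sz);  /* op_slabbed stays 0, so this OP
                                      is later freed with free(), not
                                      via the slab */
    #endif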


A second bug of note is RT #118139. Reini Urban figured out that Storable
can crash when used in DESTROY blocks which are being run during global
destruction, because it is accessing an already freed PL_modglobal or the
internal ctx.

I feel guilty about this, because I believe that it's the same problem that
I first hit in 2003 (while writing slides for the German Perl Workshop). I
saw that the C stacktrace of the SEGV meant that it was during global
destruction, and concluded roughly "OMG global destruction" and "I don't
have time to figure this out *and* write my slides", and so, strangely
enough, elected to deal with my slides. And there it languished.

The problem with global destruction is that it has to destroy everything.
What's the problem with that? In a word, cycles. If objects A and B each
reference each other, then (obviously) B needs to stick around longer than
A, in case A's DESTROY needs to follow the reference to B for some reason.
But A needs to stick around longer than B, in case B's DESTROY...  "You go
first". "No, you go first". Global destruction cuts the Gordian knot by
destroying things in any order it feels like. The upshot is that anything
you needed might already be gone, so your code gets to live in interesting
times. So interesting that it may not actually be able to work. This, I had
assumed, was quite the difficulty with Storable - that a solution might not
exist - so I didn't investigate further.

Unlike me back then, Reini Urban hadn't given up so easily, and distilled it
to an excellent clear test case, and suggested a patch to avoid the SEGV, by
causing Storable to die if used during global destruction. (An exception
which quite possibly wouldn't be seen, but that's considerably better than a
SEGV.)

However, it gave Leon Timmermans an insight. Why is Storable using DESTROY?
It's not actually running any Perl code. It's just using a blessed reference
as a hook to ensure free() is called on some memory allocations in its
per-interpreter context. The problem is that during global destruction
references to that memory *are* still visible to Storable routines called
from Perl code - a use-after-free bug.

So he figured that we can ensure the same cleanup is run by attaching magic
to the context scalar instead of blessing it. The mg_free hook is called
when all scalars are finally freed. This is much later during global
destruction, after all destructors have been called and no more Perl code is
running. Hence this ensures the desired ordering constraint - that the
context survives until after all code no longer needs it, but is freed
eventually. I implemented his suggestion, and verified that this alternative
solution also made Reini's test case pass. So this is the solution that we
went with, as it permits Storable to be used reliably during global
destruction, instead of reliably forbidding it.
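In outline, the replacement mechanism looks like this (a simplified
sketch of the API use, not Storable's actual code; my_context is a
stand-in type):

    /* The free hook runs when the magic's SV is finally freed, late
       in global destruction, after all DESTROY methods have run. */
    static int
    ctx_mg_free(pTHX_ SV *sv, MAGIC *mg)
    {
        my_context *ctx = (my_context *)mg->mg_ptr;
        /* ... release the per-interpreter allocations here ... */
        Safefree(ctx);
        return 0;
    }

    static const MGVTBL ctx_vtbl = {
        NULL, NULL, NULL, NULL,
        ctx_mg_free   /* svt_free */
    };

    /* At setup time, instead of bless()ing the context scalar: */
    sv_magicext(ctx_sv, NULL, PERL_MAGIC_ext, &ctx_vtbl,
                (const char *)ctx, 0);  /* namlen 0: store the pointer
                                           itself, don't copy it */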


I simplified the build infrastructure for the x2p/ subdirectory. x2p/
contains the various "to Perl" utilities, along with tools needed to build
them. In x2p/ it seems that the only consistency is inconsistency. The names
are a2p, s2p and find2perl (not f2p). The sed and find converters are
written in Perl. The awk converter is written in C. So it's not simply
a case of repeating the same build rules for three things.

I think I've read somewhere that Larry considered the awk to Perl converter
sufficiently important for Perl adoption that he delayed the Perl 1 release
until a2p was done. The C code itself uses a yacc grammar to parse awk, and
looks suspiciously like Perl 1, down to using the same data structures for
scalars and the same stdio cheating code for fgets().
(See http://perl5.git.perl.org/perl.git/commit/perl-1.0 if you're interested)

However useful it was 25 years ago to drive Perl adoption, it's not needed
for that these days, but it is still the only awk to Perl converter, so it
would be nice to ship it independently somehow. Frustratingly, it's not at
all obvious how to do this. It's not XS code linking against the core, so
ExtUtils::MakeMaker doesn't have anything helpful. In fact, the situation is
worse than this - the shipped x2p/Makefile.SH is only used for the build on
*nix systems. It's not a true little self contained sub-perl, as VMS and
Win32 include rules directly in their top level Makefiles to build the
contents of the x2p directory. Additionally, it has no tests - for some
reason I feel more uncomfortable with the idea of shipping something to CPAN
with no tests than I do with the status quo. Maybe it's because if it's an
independent distribution it is completely obvious that it has no tests.

Tests would be useful. I've searched and failed to find any suitably
licensed for inclusion in the core. Effectively, to ship under the dual APL/GPL terms, code
needs to be either exactly that (unlikely), or BSD licensed, or public
domain. Which seems surprising, as there are BSD licensed awk
implementations.

So as it seems that evicting a2p is actually more work than continuing to
host it in the core distribution, I attempted to reduce duplication in the
build setup, even if that means I've bound it a bit more tightly. Specifically,
for the past 25 years x2p has been shipping a *second* copy of the
makedepend shell script that automates calculation of C file dependencies to
generate Makefile rules, and a second copy of the cflags shell script used
to invoke the compiler. This seems wasteful. However, it took a bit of
untangling to remove some assumptions from the top-level versions of each
about only dealing with the current directory. But once that was fixed it
was possible to rework x2p's Makefile to use the versions at the top level,
and delete its private forks.

I was also easily able to move running makedepend on x2p into its own rule
in the main Makefile, so that it can happen in parallel with C compilation
for miniperl. Previously it ran as part of the main makedepend, which
happens before make is invoked, meaning it's bottlenecked on only one
core. Now a little more can happen concurrently, and maybe a second is
shaved off the elapsed build time for a parallel make.

In the x2p/ Makefile I spotted a few more tentacles of dead things that need
purging - rules that turned out to be for deleting files generated when
building on VM/ESA (an obsolete platform we have removed support for), rules
for generating targets related to the Perl compiler (removed in 2006), and
rules related to man page targets deleted by 5.000. They're not the first
dead things (eg installperl rules related to 5.000 to 5.001 changes), and
they won't be the last. But in something this complex, it's never easy to
spot what's actually dead, versus what's obscure but still needed.


Whilst on the subject of extermination, this month I finally removed
fakethr.h, and eliminated all references to it and FAKE_THREADS. fakethr.h
and FAKE_THREADS were for a "green" threads implementation of 5005threads.
5005threads itself is long gone, and it's not clear that -DFAKE_THREADS
*ever* built correctly. Certainly it did not work for the 5.005 release, and
it did not work at the time of the commits for the initial checkin. The
closest that it seems to have been to working is around commit
c6ee37c52f2ca9e5 (Dec 1997), where the headers no longer contained errors,
but perl.c failed to compile.

I had spotted this dead code a couple of years ago, but stalled on the
problem that ExtUtils::MakeMaker contained a hard-coded list of headers
which it uses to generate dependency rules for XS code, and fakethr.h was
one of them. Hence innocently removing the file would have the less
than desirable effect of causing all XS code to fail to build. (It's
very easy to break most of CPAN. It's a lot harder to *not* break CPAN.)

Manually removing another line felt like a bodge, as it doesn't actually fix
the real problem, so with plenty of other things to make progress on, I took
no action. Fortunately as part of his hash work, Yves fixed the header list
in ExtUtils::MakeMaker properly (in a way that had eluded me), such that
there is no longer a list to maintain (and get stale). Hence, with that
pre-requisite unblocked, I carefully purged the code once v5.18.0 shipped.
So that's the last part of 5.005 threads removed. That is, until the next
part is found.


As I've been removing code for obsolete platforms, I've been searching the
entire codebase for references to them. It's impressive how many nooks and
crannies need to be checked to clean up properly, and not surprising that
sometimes the odd reference gets missed. (Just like fakethr.h, hiding in
plain sight for a decade.) Part of the fun is because platform specific
README files are installed as man pages. For example, the file README.vos
becomes perlvos.pod, which is installed as a man page, which you can read
with `man perlvos`. The fun comes because various parts of the codebase want
a list of all the man pages, and these platform specific files (along with
various other things generated or copied) should be in that list. So you
can't get the complete list by globbing <pod/*.pod>. At which point, do
you create a file listing everything, or just a list of the exceptions
to add to that glob? Or something else? It's a trade-off.

One place I had noticed some time ago that had a list of perl man pages
was the perl debugger. The intent is that you can save some typing by being able
to abbreviate "doc perlfunc" as "doc func" etc, omitting the "perl" prefix
from the man page name. It needs to know the correct name because on *nix
platforms it runs the system's man command to display the page. If `man
func` failed to run, and "func" was in the list, then it would try a second
time with `man perlfunc`. To implement this abbreviation feature, the
debugger had a hard-coded list of man page names. *Inevitably* this list was
out of date, missing more recently added man pages as well as having the odd
reference to a page now removed.

(I think it's fair to say of anything that if it's neither automatically
generated nor tested for up-to-dateness, then it runs the risk of becoming
stale, and for a large enough project and an old enough project, the
probabilities aren't in favour of fresh.)

Seeking a proper solution to the problem (which doesn't create an ongoing
requirement for human intervention), I had considered using some of the
existing code in regen/ to keep the list up to date. But this felt ugly, so
I held off. Recently, though, I figured that one could completely eliminate the
list by looking at run time in the perl library tree for the installed Pod
files. If a pod file is found, it's a legitimate abbreviation. If the Pod
file isn't found, then it's assumed not to be a legitimate abbreviation, and
the second man command isn't attempted. If for whatever reason your
distribution chose not to install the Pod files, all that "breaks" is that
you have to type "doc perlfunc" each time. It seemed like the right
trade-off.

So I started work. Only to find that in the shipping perl5db.pl the man
command had been inadvertently broken by a recent refactoring, and no-one
had spotted this. Obviously, I fixed this first, and wrote the simplest test
I could to try to ensure that the doc command didn't break again. Try to run
"man perlrules", and expect that to fail. And "it fails" isn't quite good
enough, as the test needs to be sure that it gets the "not found" error
message from running an external command, given that the regression it's
trying to protect against is failure to run that. So I wrote a minimal test
that causes the debugger to run `man perlrules` and check that it's man that
is failing.

Fail it does, but not always with the same message. I'd assumed that the
failure message was the same on all *nix systems. Tony Cook fixed the test
to skip on all platforms except Linux. Whilst only testing on one platform
isn't ideal, it's a common platform, so the tests are going to get run
enough, and it's better than no tests, which seemed to be the only
alternative. So Linux only should be safe, right?

If the question is in the title, you know that the answer is "no". In this
case one also needs to override the locale, in case man has been localised to
report translated error messages. Tony tested a fix on the failing
platforms, localising $ENV{LANG} and $ENV{LC_MESSAGES} to "C", and pushed
this to blead. Which worked on *those* platforms. But whack-a-mole wasn't
quite over, because LC_ALL overrides LC_MESSAGES, as demonstrated by
subsequent platform failures. So it took one more tweak to the test before
we had something reliable enough to have no failures. Everything takes
longer than you expect it to, because testing portably and robustly is hard,
and you usually don't get it exactly right first or even second time.


The final purge this month was the code to support DG/UX. DG/UX was a Unix
sold by Data General. The last release was in April 2001. It only runs on
Data General's own hardware.

It should have been removed before v5.18.0 but it got missed, and then
seemed an inappropriate thing to do during the code freeze. There is still a
lot of code for platforms we don't have the infrastructure to support. In
some cases we doubt that Perl even compiles natively. Specifically, if
you're using Perl on any of z/OS, Windows CE, NeXT, AmigaOS, DJGPP, NetWare
(natively), OS/2 or Plan 9, we'd really like help in getting Perl 5
buildable and regularly smoked. Otherwise it's not clear how long we can
keep considering those platforms as viable.

https://metacpan.org/module/RJBS/perl-5.18.0/pod/perldelta.pod#Future-Deprecations


A more detailed breakdown summarised from the weekly reports. In these:

16 hex digits refer to commits in http://perl5.git.perl.org/perl.git
RT #... is a bug in https://rt.perl.org/rt3/
CPAN #... is a bug in https://rt.cpan.org/Public/

[Hours]		[Activity]
  0.25		5.18.0 RCs
  1.00		APIs
  1.75		DG/UX
  4.75		Floating point stringification (RT #108378, RT #115800)
  0.25		HP-UX smoke
  5.50		HvFILL
  0.25		Jenkins
  1.25		PVHV (critbit)
  0.50		RC testing
  0.50		RT #113920
  0.25		RT #114878
  0.25		RT #116098
  0.25		RT #116407
  1.00		RT #116943
  1.00		RT #116989
  3.00		RT #117031
  0.25		RT #117209
  0.25		RT #117501
  0.50		RT #117793
  1.50		RT #117835
  0.25		RT #117885
  1.00		RT #117893
  0.75		RT #117903
  0.50		RT #117907
  0.25		RT #117917
  0.75		RT #117941
  0.50		RT #117967
  0.25		RT #117969
  0.25		RT #117977
 13.25		RT #118055
		RT #118055 sparc32 use64bitint
  0.25		RT #118121
  6.50		RT #118139 (Storable)
  3.00		RT #118157
  0.25		RT #118187
  1.00		RT #118197
  0.25		RT #118225
  0.25		RT #5087
  0.25		RT #63402
  0.75		RT #78674 (and the hazards of PUTBACK)
  0.25		SV_CHECK_THINKFIRST()
  0.25		SelfLoader (CPAN #85572)
  5.00		Storable
  2.25		TAP::Parser
		TAP::Parser (CPAN #84939)
		TAP::Parser (Win32 failure)
  2.75		Unicode Names
  0.50		a2p
  2.00		bisect.pl
  0.75		buildcustomize.pl
  0.50		diag.t/regen/embed
  1.00		fakethr.h
  8.25		makedepend.SH, x2p/Makefile.SH
  2.75		perl518delta
  2.00		perl5db
 28.50		pp_pack.c
		pp_pack.c (BYTEORDER)
		pp_pack.c (hton* and htov*)
  6.00		process, scalability, mentoring
 20.75		reading/responding to list mail
  1.00		smoke-me branches
  0.25		used only once
======
139.25 hours total

* If you weren't aware of it, GE in Canada need PDP-11 programmers for the
   next 27 years. That's more job security than most other firms offer.
   But assembler is the skill they care about, not Perl:
   http://www.vintage-computer.com/vcforum/showthread.php?37827-Greetings-from-GE-Canada
