
NWCLARK TPF grant report #93

From: Nicholas Clark
Date: July 26, 2013 14:00
Subject: NWCLARK TPF grant report #93
Message ID: 20130726140035.GL4940@plum.flirble.org

[Hours]		[Activity]
2013/06/10	Monday
 0.25		RT #118365
 1.50		dots
 4.25		reading/responding to list mail
=====
 6.00

2013/06/11	Tuesday
 0.25		RT #109744
 1.00		VMS
 0.50		dots
 7.50		miniperl Makefile bootstrap ordering
 0.75		reading/responding to list mail
=====
10.00

2013/06/12	Wednesday
 0.25		File::Spec XS
 1.25		Storable
 7.00		miniperl Makefile bootstrap ordering
=====
 8.50

2013/06/13	Thursday
 2.25		Storable
 2.25		VMS-Filespec and known_extensions
 1.75		miniperl Makefile bootstrap ordering
=====
 6.25

2013/06/14	Friday
 0.25		FindExt
 0.25		RT #114576
 0.25		RT #118195
 0.50		Storable (HP/UX)
 2.25		VMS-Filespec and known_extensions
 0.75		dots
 1.75		known_extensions, unbuilt non-XS extensions
 0.25		reading/responding to list mail
=====
 6.25

2013/06/15	Saturday
 1.75		known_extensions, unbuilt non-XS extensions
=====
 1.75

Which I calculate is 38.75 hours.

I spotted a way to remove a few more tangles from the build, on *nix, VMS
and Win32. It's always fun having to juggle three different objects
together, and this was no exception.

The build has never depended on having Perl installed. Perl's portability
was able to scale to multiple architectures and OSes by

1) having the configuration system compile and *run* test programs to find
   out what works, and what needs to be worked around
2) bootstrapping as quickly as possible to a minimally working perl and then
   writing as much of the rest of the build infrastructure once, in Perl.

Attempting to adapt that to also permit cross-compiling is hard, which is
why it hasn't happened (yet). But all our build tools cross-compile nicely.
(On *nix, that would be sh, sed, awk, grep, make, cc.) Hence one can
bootstrap Perl 5 onto a new platform, albeit in a rather roundabout way, by
first bootstrapping a native toolchain.

The various platform Makefiles contain the logic to try to get from some C
source to "working miniperl" as rapidly as possible. Part of the fun is that
a lot of the modules that are needed to "work" are actually dual life, hence
are shipped in dist/ or cpan/, and some modules, most importantly Config,
need to be generated from the platform specific build files. Additionally,
the build needs to be able to run in parallel*, which means that

1) it's beneficial to split build tasks as small as possible to maximise
   concurrency
2) it's necessary for every task to know its pre-requisites, so that make
   won't accidentally run a rule before something it depended on gets built

(or, how this actually manifests - the build fails some of the time due to a
race condition caused by a missing dependency, and it's very hard to
recreate and track down.)

Hence the build rules for things early in the build ended up being quite
tightly coupled to everything else early in the build, because as soon as
one changes where a file is located, or how it is built, all its explicit
and implicit dependencies have to be updated.
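
To make the "every task must know its pre-requisites" point concrete, here
is a minimal sketch in Perl (not anything from the real Makefiles). The task
names and the dependency table are purely illustrative assumptions. It
groups tasks into "waves" that a parallel make would be free to run
concurrently:

```perl
use strict;
use warnings;

# Hypothetical early-build tasks and their prerequisites. These names are
# illustrative only; they are not the real Makefile targets.
my %needs = (
    'miniperl'       => [],
    'customize file' => ['miniperl'],
    'Config.pm'      => ['customize file'],
    'extension A'    => ['Config.pm'],
    'extension B'    => ['Config.pm'],
);

# Group tasks into waves: everything in one wave may run concurrently,
# because all of its prerequisites were completed in an earlier wave.
sub waves {
    my (%deps) = @_;
    my (%done, @waves);
    while (scalar(keys %done) < scalar(keys %deps)) {
        my @ready = grep {
            my $task = $_;
            !$done{$task} && !grep { !$done{$_} } @{ $deps{$task} }
        } sort keys %deps;
        die "dependency cycle or unknown prerequisite\n" unless @ready;
        @done{@ready} = (1) x @ready;
        push @waves, \@ready;
    }
    return @waves;
}

my $n = 0;
printf "wave %d: %s\n", ++$n, join ', ', @$_ for waves(%needs);
```

If the entry for 'extension B' wrongly had an empty prerequisite list, the
sketch would place it in the first wave alongside miniperl. In a real
parallel build that is exactly the kind of missing dependency that only
fails some of the time.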

One particularly "big" dependency (because it is very early) is the file
lib/buildcustomize.pl. This is a key part of enabling the build to work at
all. If "$INC[0]/buildcustomize.pl" exists, then it's loaded by miniperl.
The trick is that lib/buildcustomize.pl sets @INC to the absolute paths of
all the toolchain modules in ext/, dist/ and cpan/, so that the toolchain
can be shipped in an easy to maintain layout, but is capable of being loaded
to install each module into lib/ without first being in lib/. In turn,
lib/buildcustomize.pl is written by write_buildcustomize.pl using the
pure-Perl code in Cwd, building on the existing cross-platform nature of the
Perl code to avoid having to produce 3 (or more) platform specific ways of
converting directories to absolute paths.
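
As a rough illustration of the idea (this is not the real
write_buildcustomize.pl), a generator along these lines could produce such
a file. The directory list is an illustrative subset, the helper names
quote_path and customize_text are invented for the example, and joining
paths with "/" is a simplification that ignores VMS syntax:

```perl
use strict;
use warnings;
use Cwd ();

# Illustrative subset of toolchain directories (assumption: the real
# script knows the full, real list).
my @toolchain = qw(
    cpan/ExtUtils-MakeMaker/lib
    dist/Cwd
    dist/Cwd/lib
    ext/VMS-Filespec/lib
);

# Quote a path as a Perl single-quoted string literal.
sub quote_path {
    my ($path) = @_;
    $path =~ s/([\\'])/\\$1/g;
    return "'$path'";
}

# Return the text of a file which, when loaded, puts the given
# directories (made absolute under $root) at the front of @INC.
# Keeping the existing @INC entries afterwards is a choice made for
# this sketch, so that a -Ilib given on the command line survives.
sub customize_text {
    my ($root, @dirs) = @_;
    my $list = join ",\n", map { '    ' . quote_path("$root/$_") } @dirs;
    return "# Generated file: do not edit.\n"
         . "\@INC = (\n$list,\n    \@INC,\n);\n1;\n";
}

# Print the generated text rather than writing it anywhere.
print customize_text(Cwd::getcwd(), @toolchain);
```

The point of generating absolute paths once, in Perl, is that the same
script then serves every platform's makefile.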

Once lib/buildcustomize.pl is in place, just running `./miniperl -Ilib` is
enough to make the otherwise unbuilt distribution behave enough like a
"normal" *installed* perl that the rest of the build system doesn't need to
set up anything special. The upshot of all this is that there's one small
piece of code which works everywhere (win for the Perl build scripts), but
every rule that runs miniperl in the Makefile (and the Win32 Makefiles, and
DESCRIP.MMK) needs to ensure that it exists.

What I realised was that by removing one little bit of concurrency it would
be possible to simplify quite a lot of the other rules. Not just the direct
simplification of only having one dependency, but also a more subtle
simplification - once lib/buildcustomize.pl is in place, then Cwd is in
@INC (being one of the toolchain modules that write_buildcustomize.pl
locates). Hence various other rules which previously had miniperl invoked
with multiple -I options to ensure that the pure-Perl Cwd could be loaded
from dist/ could now have all those extra -I options eliminated, as -Ilib
does it all once lib/buildcustomize.pl exists.

Specifically, by combining the rule that links miniperl with the rule to
generate lib/buildcustomize.pl, all this simplification would fall out.
And, somewhat perversely, it's actually conceptually simpler to have the rule
"officially" be for lib/buildcustomize.pl, with the miniperl rule depending
on it, than the other way round, as this means that the rest of the
Makefile(s) can depend on miniperl, which is much simpler to skim.

Of course, all this is only obvious in hindsight, and inevitably the devil
is in the detail when it comes to actually getting it to work, and work
reliably (more of that next week).


While removing the dependencies on [.lib]buildcustomize.pl from the VMS
makefile I noticed that for VMS there was a second dependency that featured
heavily - [.lib.VMS]Filespec.pm - thanks to a requirement to copy it from
[.vms.ext] before it could be used. And, bonus, more code to copy its test
to [.t.lib]. All this was special case code, which could be completely
eliminated if both files could be moved into a regular extension in the
directory ext/VMS-Filespec, similar to ext/VMS-DCLsym and ext/VMS-Stdio, and
like them only built on VMS. The only thing added would be one line in
write_buildcustomize.pl to add ext/VMS-Filespec/lib to the toolchain @INC.

Of course, all this should be simple. But if it were simple, how come
VMS::Filespec isn't already in ext/? After all, VMS::DCLsym and VMS::Stdio
were both previously in vms/ext/, so how come all three weren't moved at the
same time? After all, *nix and Win32 already know to not try to build or
test VMS::DCLsym and VMS::Stdio, so why not add a third?

The answer (as ever) turns out to be another yak that needs shaving.
VMS::DCLsym and VMS::Stdio are XS modules. The build and test infrastructure
is quite capable of skipping XS modules. It has to be, because not all XS
modules can be built everywhere. But for various reasons, none of which were
really designed, it's not capable of skipping a pure-Perl module. I was
aware of this already, but now that I had a real use case that it was
preventing me from implementing, it was irritating enough that I had reason
to fix it.
Of course, it wasn't a small job, and consumed a good chunk of the next week
too...
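
To picture the missing capability, here is a small sketch in Perl. It is
not how the core build decides what to build; the function name wanted_on
and the "VMS-" prefix rule are assumptions made up for the illustration.
The only point is that the decision should not care whether an extension
is XS or pure-Perl:

```perl
use strict;
use warnings;

# Decide whether an extension directory should be built and tested on
# the OS named by $os (same spelling as $^O, e.g. 'VMS' or 'linux').
# Assumption for this sketch: a "VMS-" prefix marks a VMS-only
# extension, regardless of whether it is XS or pure-Perl.
sub wanted_on {
    my ($ext, $os) = @_;
    return $os eq 'VMS' if $ext =~ /^VMS-/;
    return 1;
}

# Example extension names; File-Find stands in for any pure-Perl
# extension that builds everywhere.
my @exts = qw(VMS-DCLsym VMS-Filespec VMS-Stdio File-Find);
print "$^O builds: @{[ grep { wanted_on($_, $^O) } @exts ]}\n";
```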

Nicholas Clark

* Being able to run the build in parallel is cheaper than increasing the
  number of hours in the day. Although I'm sure if you ask nicely on the
  Internet, someone will offer to take money from you to implement the
  latter solution. :-)


