
NWCLARK TPF grant report #96

Nicholas Clark
October 3, 2013 13:25
[Hours]		[Activity]
2013/07/01	Monday
 1.25		RT #118543
 0.25		smoke-me branches
 0.50		smoke-me/blead
 0.25		smoke-me/blead (CRLF)
 0.25		smoke-me/trim-superfluous-Makefile

2013/07/02	Tuesday
 1.50		Exporter -> dist/Exporter
 1.50		File::Find -> ext/
 0.50		File::Spec XS
 1.25		FindExt, VMS::FileSpec
 0.50		PL_op_exec_cnt cleanup
 3.75		VMS-Filespec and known_extensions
 0.50		postfix dereference
 0.50		smoke-me/quieten-readonly-ops
 0.50		smoke-me/sys_intern
 0.25		sub {}

2013/07/03	Wednesday
 2.50		File::Find -> ext/
 5.25		utils/Makefile.SH

2013/07/04	Thursday
 1.00		Compiled-in POSIX character class inversion lists
 2.25		ExtUtils::Miniperl
 0.25		smoke-me branches

2013/07/05	Friday
 1.75		ExtUtils::Miniperl

2013/07/06	Saturday
 1.75		ExtUtils::Miniperl

Which I calculate is 28.00 hours

The theme for this week seems to be "clean up parts of the build".

perl was first developed on a Unix system, back in the times when there were
dozens of different Unix variants. Hence portability was initially across
the different C libraries (and even C compilers - remember, Perl predates
the C89 standard, let alone its implementation). Figuring out precisely what
the system could(n't) do (correctly) was performed by a shell script,
Configure, and it expanded its variables into files needed for the build by
using simple shell scripts as templates.

It's still how configuration is done by perl 5, and "figure things out with
a shell script" is the standard way that pretty much all open source
software builds on *nix systems. There's an awful lot of useful hard-won
knowledge distilled into these shell-based configuration systems, and whilst
they aren't that pretty, they work, which to the end user is what matters.

The template expansion is needed because, to maintain sanity, there has to
be a clean distinction between files that are generated as part of the
configuration and build, and files that are shipped with the package (from a
tarball, or from a version control system). There's
implicitly a concept of who "owns" the file - the human using a text editor,
or the build system.

Makefiles are one of the types of files whose contents need to vary based on
things determined by Configure. Hence, it naturally falls out from the above
description that the various subdirectories have Makefiles which are
generated by templates implemented as shell scripts. This plan doesn't fare
so well now that "portability" no longer just means to different *nix
systems, but also to systems that don't even have shells.
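
As a rough illustration of the template approach (a sketch only, not the
contents of any real Makefile.SH), a shell template can source the
variables that Configure determined and expand them through here-documents.
The function names, the config file contents and the Makefile lines below
are all made up for the example:

```shell
#!/bin/sh
# Hypothetical stand-in for the configuration file that Configure writes.
# The variable names and values are illustrative assumptions.
write_config() {
    cat >"$1" <<'EOF'
cc='cc'
optimize='-O2'
EOF
}

# A template in the style of a Makefile.SH: read the configuration, then
# expand its variables into the generated Makefile.
# $1 is the config file (give a path containing a slash, so that "." does
# not search PATH); $2 is the Makefile to write.
expand_template() {
    . "$1"
    # Unquoted delimiter: the shell substitutes $cc and $optimize now.
    cat >"$2" <<EOF
CC = $cc
OPTIMIZE = $optimize
EOF
    # Quoted delimiter: the text is copied verbatim, so make (not the
    # shell) gets to interpret $(CC) and $(OPTIMIZE) later.
    cat >>"$2" <<'EOF'
LINK = $(CC) $(OPTIMIZE)
EOF
}
```

Calling write_config ./config.sh and then expand_template ./config.sh
Makefile would leave a three line Makefile behind. The point to notice is
that the whole mechanism depends on having a shell to run it.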

A particular mess was the Makefile for the utils/ directory. Like the other
Makefiles it was generated by running a Makefile.SH. However,
this doesn't work on Win32, so the solution was to *also* check the
generated Makefile into the repository. Just to add to the fun, here VMS is
special, and has rules in its top-level Makefile (DESCRIP.MMS) to generate
files in utils/, and completely ignores utils/Makefile.

This particular Makefile has very few configuration-controlled changes, so
typically the regenerated version was identical to the checked-in version.
However, for some configurations it would differ, with the result that
configure && make of a clean tree would result in a dirty tree, with
(seemingly) unchecked-in changes. This is wrong - building should never make
a clean tree dirty.

The irony is that the Makefile generation doesn't need to be a shell script.
By the time that utils/Makefile is needed, there is already a fully
serviceable miniperl ready and waiting.

So I bit the bullet and replaced utils/Makefile.SH with utils/Makefile.PL.
Carefully. Firstly by replacing the shell script with a perl script that
generates a byte-for-byte identical utils/Makefile (including boilerplate
that claims it was generated by utils/Makefile.SH), which was called from
the same place that the *nix Makefile had been calling Makefile.SH.
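
One way to check a "byte-for-byte identical" claim like this is to run the
old and new generators and compare their output with cmp. The helper below
is a minimal sketch under the assumption that each generator can be run as
a command that writes to standard output; same_output is a made-up name,
not something in the perl tree:

```shell
#!/bin/sh
# same_output OLD_CMD NEW_CMD
# Run two generator commands, capture what each writes to stdout, and
# succeed only if the two outputs are identical byte for byte.
same_output() {
    old=$(mktemp) || return 2
    new=$(mktemp) || { rm -f "$old"; return 2; }
    sh -c "$1" >"$old"
    sh -c "$2" >"$new"
    cmp -s "$old" "$new"    # -s: stay silent, report via exit status only
    rc=$?
    rm -f "$old" "$new"
    return "$rc"
}
```

A generator that writes a file rather than stdout would need a small
wrapper, but the comparison step stays the same.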

With that working on *nix, I then added rules to the Win32 Makefiles to call
utils/Makefile.PL to generate utils/Makefile, and rules to both them and the
*nix Makefile to delete the generated file as part of the cleanup targets.
With that done, I could check in a commit that deleted utils/Makefile from
git, and we no longer had a file in the repository being overwritten as part
of the build.

The other mess I tidied up this week was ExtUtils::Miniperl. If you're
building an XS extension and want to link it statically with perl,
ExtUtils::MakeMaker achieves this by writing out a new perlmain.c for you,
compiling that, and linking it with your extension and libperl.a (from
the installed tree). It generates that new perlmain.c using the module
ExtUtils::Miniperl. It's also the case that perlmain.c as used by the core
tree to build perl itself is generated by ExtUtils::Miniperl (well, since
commit fbcaf61123069fe4 in Nov 2010, when I changed the *nix build to use
the module, and eliminated the shell script that used to do it - spot a
pattern here?).

However, how ExtUtils::Miniperl *used* to get built was itself an exercise
in layered yaks. ExtUtils::Miniperl is a Perl module to generate C code. It
used to be generated at build time by a script, which would read
the file miniperlmain.c (checked into git - the file containing the main()
used by miniperl), scan the C code for known markers, and then write out the
module. So we had Perl code scanning C code to generate Perl code that will
generate C code.
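
To give a feel for what "scan the C code for known markers" involves, here
is a small sketch that pulls out the lines between two marker comments. The
marker text and the function name are invented for the example; they are
not the markers that miniperlmain.c actually used:

```shell
#!/bin/sh
# between_markers FILE
# Print the lines that sit strictly between a "BEGIN GENERATED" marker
# comment and the following "END GENERATED" marker comment.
# Both marker strings are hypothetical.
between_markers() {
    awk '
        /END GENERATED/   { inside = 0 }   # stop before printing the marker
        inside            { print }
        /BEGIN GENERATED/ { inside = 1 }   # start on the line after it
    ' "$1"
}
```

Anything that edits, rewords or removes one of the marker comments
silently changes what this extracts, which is why marker-driven scanning
tends to be fragile.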

This is obviously more complex than it needs to be. But what's the best way
to untangle it? Ideally we'd also write out miniperlmain.c using
ExtUtils::Miniperl, to avoid needing to have code that scans it (and fragile
marker comments). The problem here is that miniperlmain.c is needed to build
miniperl, which is needed to run ExtUtils::Miniperl to generate
miniperlmain.c, so there's a circular dependency. So that's not looking good.

Turns out that there is a simple solution, by exploiting a different feature
of the distribution. There are some files in the build needed to build
miniperl which are generated by Perl code. These are not updated
automatically (they can't be - otherwise you end up with circular build
dependencies, which make abhors, and can often confuse a parallel make
sufficiently to make it fork bomb), so they are really only suitable for files
that change rarely. It also used to be the case that the checked in version
could get out of sync with the code and data used to generate it. (Humans
are involved, so forgetfulness happens - we've fixed this now with a
regression test that spots it.) So by codifying miniperlmain.c as a file
regenerated by a small script using ExtUtils::Miniperl, it was possible to
eliminate the scanning code and an entire layer of code that generates code.

Nicholas Clark