
NWCLARK TPF grant report #92

From: Nicholas Clark
Date: July 26, 2013 13:57
Subject: NWCLARK TPF grant report #92
Message ID: 20130726135713.GK4940@plum.flirble.org

[Hours]		[Activity]
2013/06/03	Monday
 6.50		RT #118283
 0.50		process, scalability, mentoring
=====
 7.00

2013/06/04	Tuesday
 2.25		RE_TRACK_PATTERN_OFFSETS/parse_start
 0.50		RT #118283
 2.50		Storable
 0.50		a2p
 0.25		process, scalability, mentoring
 0.50		reading/responding to list mail
=====
 6.50

2013/06/05	Wednesday
 1.25		RE_TRACK_PATTERN_OFFSETS/parse_start
 0.25		RT #118283
 1.00		Win32/FindExt
 2.00		process, scalability, mentoring
 0.25		reading/responding to list mail
 0.50		static extensions
=====
 5.25

2013/06/06	Thursday
 0.25		RE_TRACK_PATTERN_OFFSETS/parse_start
 1.50		Win32 & i_rpcsvcdbm
 1.25		Win32/FindExt
 0.25		lib/perlmodlib.PL
 0.50		reading/responding to list mail
 0.50		smoke-me branches
=====
 4.25

2013/06/07	Friday
 0.25		RT #118175
 0.25		RT #118365
 5.25		reading/responding to list mail
 0.25		static build on Win32
=====
 6.00

Which I calculate is 29.00 hours

"The nice thing about standards is that you have so many to choose from"
(Andrew S. Tanenbaum). I guess the same can be said about build systems.

So the structural intent of the build is

1) Permit the user to choose configuration options
2) Build the package
   (which may take some time, and shouldn't need user intervention)
3) Test the package, and collate all test results into one report at the end
   (an excuse for a second tea break)
4) Install the package
   (which probably runs with elevated privileges)

As well as trying to avoid a long period where a human needs to babysit the
build in case it stops to ask a question, this approach also has the benefit
that you find out by the end of configuration what extensions the build
stage should be producing. Or, more importantly (compared with at least one
other similar language), you don't need to wait until the end of the build
run to discover that an extension you really needed isn't built, and then
have to iterate the entire configure & build steps until you figure out the
correct form of rubber chicken sacrifice to make it all work.

Of course, the problem is that for step 1 you can't assume that a copy of
Perl is already available (because how did it get built?), so the
configuration system has to run using native tools. And the more platforms
the package is ported to, the more variations of native tools you have.

So, on *nix and VMS, where the OS, architecture and even the make utility
will vary, the configuration script figures out which extensions are shipped
by scanning the file system, because even the Makefile has to be
programmatically generated to cope with platform quirks. On Win32 there is
far less variation, so it's viable to ship a pair of Makefiles which between
them cover all the common make variants. Hence on Win32 configuration is
implemented by changing options in the appropriate Makefile, and the build
determines which extensions are wanted by combining those options with a
scan done by the (uninstalled) FindExt module.

So that's a Perl module, right? Which means that we can test it in a
platform-independent way. Which turned out to be useful back in 2009 when I
was working out how to move modules to cpan/, dist/ and ext/ as part of the
big rearranging to make dual life a lot simpler, as I could mostly verify
that my changes were going to work on Win32 without having any direct access
to a Win32 system to test it. The tests written for that purpose were robust
enough that they were moved to t/porting and run as standard, which verifies
that the logic in FindExt is consistent with that of Configure.
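
As an illustration only, the heart of such a consistency check is a
comparison of two lists of extension names. The sketch below shows that
comparison in C; the real tests are Perl scripts, and the function names and
sample lists here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name appears in list[0..n-1], else 0. */
static int
in_list(const char *name, const char *const *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(name, list[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Count the names that appear in exactly one of the two lists.
 * Assumes neither list contains duplicates.  A result of 0 means
 * the two sources agree on the set of extensions.
 */
static size_t
count_discrepancies(const char *const *a, size_t na,
                    const char *const *b, size_t nb)
{
    size_t i;
    size_t diff = 0;

    assert(a != NULL && b != NULL);

    for (i = 0; i < na; i++) {
        if (!in_list(a[i], b, nb))
            diff++;
    }
    for (i = 0; i < nb; i++) {
        if (!in_list(b[i], a, na))
            diff++;
    }
    return diff;
}
```

Given one list from each source, a non-zero result says the two disagree,
and each name found in only one list is a discrepancy to investigate.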

However, we weren't able to test everything. We couldn't correctly test the
list of static extensions due to various problems, and the list of
dynamically built extensions failed to match due to two discrepancies between
Configure's logic and FindExt's.

Firstly, due to a typo in checking defines in %Config::Config, FindExt
thought that I18N::Langinfo would never be built (whereas it is built on
most *nix systems). So I fixed that, and everything now passed on *nix.
However, the test still failed on Win32, thanks to a problem that was a bit
more convoluted. In replicating Configure's logic, FindExt thought that
ODBM_File *would* be built on Win32, because the Win32 canned configs had
i_rpcsvcdbm set to define. What on Earth is i_rpcsvcdbm?

	This variable conditionally defines the I_RPCSVC_DBM symbol, which
	indicates to the C program that <rpcsvc/dbm.h> exists and should
	be included.  Some System V systems might need this instead of <dbm.h>.

Eh? Win32 is most definitely not an ancient System V Unix, and won't repeat
the same old quirks (it has brave new quirks instead). It turned out that
FindExt was quite correct, and the canned configs (and header files) had
been wrong since 1997. The problem hadn't been spotted because the Win32
configuration explicitly says not to build ODBM_File. Now it's correct.
Combine all this with fixes by (at least) Steve Hay and Tony Cook, and it's
now possible to test that FindExt and Configure agree on which extensions
are to be built, and which are dynamically linked, which are statically
linked, and which are non-XS. While these changes are of low utility in
themselves, all this would prove useful in unravelling more of the build
complexity in the following weeks.
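
To make it clearer why a wrong i_rpcsvcdbm could go unnoticed for so long,
here is a minimal sketch in C. It assumes that the probe rule is "ODBM_File
is a candidate if either i_dbm or i_rpcsvcdbm is define", and that an
explicit exclusion overrides the probe. The struct, the function names and
the sample values are all hypothetical, and the real logic lives in
Configure and FindExt, not in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the configuration; values are "define" or "undef". */
struct dbm_config {
    const char *i_dbm;        /* is <dbm.h> available?        */
    const char *i_rpcsvcdbm;  /* is <rpcsvc/dbm.h> available? */
};

static int
is_defined(const char *value)
{
    return value != NULL && strcmp(value, "define") == 0;
}

/* ASSUMPTION: either header being available makes ODBM_File a candidate. */
static int
odbm_probe_says_build(const struct dbm_config *cfg)
{
    assert(cfg != NULL);
    return is_defined(cfg->i_dbm) || is_defined(cfg->i_rpcsvcdbm);
}

/* An explicit "do not build" setting overrides whatever the probe says. */
static int
odbm_actually_built(const struct dbm_config *cfg, int explicitly_excluded)
{
    return odbm_probe_says_build(cfg) && !explicitly_excluded;
}
```

With i_rpcsvcdbm wrongly set to define, the probe rule says "build", but the
explicit exclusion means nothing is built, so the build behaves the same
before and after the fix. Only a test comparing the probe results directly
can show the difference.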


I also found a small but representative example of how the best of
intentions don't always produce the best solution to a problem, and can
actually increase clutter.

a2p, the awk to Perl converter, is written in C. It dates from the time of
perl 1, two years before the first ANSI C standard, and like perl 1 it
started with the then classic 3-argument main() function:

    main(argc,argv,env)
    register int argc;
    register char **argv;
    register char **env;
    {

K&R style was converted to ANSI style with commit f0f333f455368029 back in
1997 and it has stayed fundamentally the same ever since, although the
register declarations have been removed, and const added. The perl
interpreter's main() function has evolved in the same way.

Hence in 2005, when Jarkko cranked up the strictness on the Tru64 compiler,
and fixed all issues that it warned about, he added the relevant pragma to
both perl and a2p to stop the compiler warning about the non (ANSI) standard
third parameter. Seems sane.

What no-one noticed was that unlike perl, a2p's main() doesn't actually
*use* the env parameter, so a better solution is to remove it. Which means
that the pragma can be removed too. So that's 4 lines gone, and 1 line
simplified.
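
As a rough illustration (not a2p's actual source), an entry point in the
standard two-parameter form looks like the sketch below. The function is
called sketch_main, a hypothetical name, so that it can be driven from
another function; in a real program it would simply be main(). The
option-counting body is invented purely to give it something to do.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a two-parameter main().  It reads only argc
 * and argv, so there is no third "env" parameter to declare and therefore
 * no compiler warning about it to silence.
 *
 * Returns the number of arguments that look like options, that is,
 * those starting with '-'.  argv[0] (the program name) is skipped.
 */
static int
sketch_main(int argc, const char **argv)
{
    int i;
    int options = 0;

    assert(argc >= 0 && argv != NULL);

    for (i = 1; i < argc; i++) {
        if (argv[i][0] == '-')
            options++;
    }
    return options;
}
```

Code that does need the environment can still reach it through the standard
getenv() function, without a third parameter to main().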

Each of these sorts of things on its own isn't really a problem, and really
isn't a priority to find, let alone fix. But there are potentially many
things which could be terser, tidier and clearer, and the sum of all the
little bits of suboptimal verbosity mounts up, making the core's code harder
for everyone to follow. Hence it seems sane to tackle them as and when they
are found, if there's an obvious simple safe fix.

Nicholas Clark


