
NWCLARK TPF grant report #50

Nicholas Clark
September 7, 2012 13:04
[Hours]		[Activity]
2012/08/13	Monday
 6.00		newCONSTSUB
 1.00		reading/responding to list mail
 2.50		readonly ops

2012/08/14	Tuesday
 0.50		ext/B/t/optree_misc.t
 3.00		reading/responding to list mail
 3.25		readonly ops

2012/08/15	Wednesday
 1.50		CPAN #78624
 0.25		bootstrapping Will Braswell
 0.25		cross compiling
 0.25		ext/B/t/optree_misc.t
 1.00		jemalloc
 4.25		reading/responding to list mail

2012/08/16	Thursday
 0.50		bootstrapping Will Braswell
 2.50		optimising sub entry
 1.75		reading/responding to list mail

2012/08/17	Friday
 0.25		IO-Socket-IP
 0.50		Remove support for UTS Global.
 0.25		bootstrapping Will Braswell
 1.25		dl_aix.xs
 2.50		microperl
 0.25		newCONSTSUB
 0.25		optimising sub entry
 0.75		process, scalability, mentoring
 0.75		reading/responding to list mail
 0.50		smoke-me/dynaloader_silence_xs_warning

2012/08/18	Saturday
 0.75		bootstrapping Will Braswell
 0.50		microperl

2012/08/19	Sunday
 0.50		smartmatch

Which I calculate is 37.50 hours

Of the "three related areas" mentioned in the previous week's report, I only
managed to finish "readonly ops" this week. So here goes:

ithreads is implemented on the interpreter cloning infrastructure originally
added to provide fork emulation on Win32. Part of the design for that is
that under ithreads optrees are read only and shared between threads, to
save the time and memory that would be needed to copy them. For building
without ithreads, the old rules still hold - there is no restriction that
OPs should be read only, and no restriction as to what they can point
to. However, implementing the shared OPs for ithreads required locating all
places where OPs have mutable fields or pointers to structures that are now
per-ithread, and changing the code so that when building under ithreads they
move to unshared structures, or otherwise ensure that the OP stays read only
once constructed. To my memory no bugs had cropped up post v5.6.0 relating
to this, so it was assumed that all was fine.

In 2007 I decided to check this assumption by adding the ability to
recompile perl with the OP memory allocations coming from mmap(), and using
mprotect() to turn OPs read only once they had been built. I forget what
even motivated me to do this, but the approach did find a couple more
obscure cases where OPs were being modified at runtime, in violation of the
ithreads rules.
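
A minimal sketch of the general technique, assuming a POSIX system with
anonymous mmap() (Linux here). The names demo_op, slab_alloc and
slab_make_readonly are hypothetical and are not perl's real structures or
functions; the point is only that memory obtained from mmap() can be
flipped to read only with mprotect() once it has been filled in.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for an OP; not perl's real struct op. */
typedef struct demo_op {
    unsigned type;
    unsigned flags;
} demo_op;

/* Allocate a writable, page-aligned slab with room for `count` ops.
 * Returns NULL on failure; the mapped length is stored in *len_out. */
static demo_op *slab_alloc(size_t count, size_t *len_out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = count * sizeof(demo_op);
    void *p;

    len = (len + page - 1) / page * page;   /* round up to whole pages */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *len_out = len;
    return p;
}

/* Once the ops are built, drop write permission on the whole slab.
 * Any later store into it will fault (SIGSEGV) instead of silently
 * modifying shared state. */
static int slab_make_readonly(demo_op *slab, size_t len)
{
    return mprotect(slab, len, PROT_READ);
}
```

Reads from the slab keep working after slab_make_readonly() succeeds;
only writes are trapped, which is what makes the violations visible.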

Father Chrysostomos recently refactored OP allocation to always use a slab
allocator, as part of fixing some very long standing bugs to do with OPs
leaking if compilation fails within an eval, and did some further work on
it. Because I was having "fun" trying to work out how Perl_newCONSTSUB(),
PL_curcop and various other things were interacting in reporting warning
filenames and line numbers, I decided to compile with
-DPERL_DEBUG_READONLY_OPS to see if enabling that code would shed any light
on the problem. As a matter of routine, I did this by doing a full build and
test (less than 5 minutes in parallel on reasonable hardware), and noticed
that nearly all of the tests passed in this configuration. So I set off
identifying the cause of failures, to see if it was possible to get it to
pass all tests.

It turns out that it was (at least on the x86_64 Linux system I was testing
on), as there were only two underlying causes of failures. Firstly
pp_i_modulo contains runtime code to detect a bug in glibc 2.2.5's _moddi3,
switching in a slower work around implementation if the C library is
buggy. I think that the reasoning for doing this check at runtime, rather
than compile time, is because one is (typically) linking against a shared
library here, and so detecting the problem at build time is potentially
useless - if the system is upgraded to the buggy version, your build time
information that you were safe is now stale, and bugs appear. Meanwhile if
you build when the installed version is buggy, but it's then upgraded to a
fixed version, you don't get the benefit. So when built on a platform that is
"at risk", the code does a check on the first call to pp_i_modulo, and then
picks the "right" implementation and rewrites the op to call that directly
in future. "rewrite" - that's a SEGVing offence on a read-only page. So the
simple solution was to disable all the runtime probing if
PERL_DEBUG_READONLY_OPS is defined, effectively treating glibc like every
other platform.
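
The "probe on first call, then rewrite the op" pattern can be sketched
roughly as follows. This is an illustration only: demo_op, modulo_probe,
modulo_direct and modulo_workaround are hypothetical names, and the probe
shown (checking that 3 % -10 gives 3, as C99 requires) is an assumption for
the sake of the example, not a copy of what pp_i_modulo does.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_op;
typedef long (*modulo_fn)(struct demo_op *op, long l, long r);

/* Hypothetical stand-in for an OP that carries its own implementation
 * pointer. */
typedef struct demo_op {
    modulo_fn impl;
} demo_op;

/* Fast path: trust the platform's % operator. */
static long modulo_direct(demo_op *op, long l, long r)
{
    (void)op;
    return l % r;
}

/* Slower path: work on magnitudes, then restore the dividend's sign.
 * (Ignores the LONG_MIN corner case to keep the sketch short.) */
static long modulo_workaround(demo_op *op, long l, long r)
{
    long m = labs(l) % labs(r);
    (void)op;
    return l < 0 ? -m : m;
}

/* Installed initially. On the first call it tests the platform, then
 * overwrites op->impl so later calls go straight to the chosen
 * implementation. That store is the write that faults if the op lives
 * on a read-only page. */
static long modulo_probe(demo_op *op, long l, long r)
{
    volatile long a = 3, b = -10;   /* volatile: stop compile-time folding */

    op->impl = (a % b == 3) ? modulo_direct : modulo_workaround;
    return op->impl(op, l, r);
}
```

Disabling the probe amounts to installing the direct implementation up
front, so that nothing ever needs to store into the op at runtime.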

The only other write action on OPs was the debugger setting breakpoints.
When the debugger is enabled, all NEXTSTATE ops are changed at compile time
into DBSTATE ops, and if OPf_SPECIAL is set on a DBSTATE op then a callback is
made into the debugger. Clearly setting or clearing OPf_SPECIAL on an OP at
runtime is a write activity. Given that the debugger itself is aware of
threads, and it is documented that setting a breakpoint applies to all
threads, I decided that the right solution was to explicitly permit this OP
writing, by tweaking the C code to set the OP read/write before altering the
flag, and back to read only afterwards.
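
In outline, the "unprotect, modify, reprotect" step looks something like
the sketch below. It assumes the op lives in memory obtained from mmap()
on a POSIX system; demo_op, DEMO_SPECIAL, protect_op and set_breakpoint
are hypothetical names for illustration, not the identifiers used in the
perl source.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_SPECIAL 0x1u          /* hypothetical stand-in for OPf_SPECIAL */

/* Hypothetical stand-in for an OP; not perl's real struct op. */
typedef struct demo_op {
    unsigned type;
    unsigned flags;
} demo_op;

/* Change the protection of the page(s) that contain *op. mprotect()
 * needs a page-aligned start address, so round the pointer down. */
static int protect_op(demo_op *op, int prot)
{
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)op & ~(page - 1);
    uintptr_t end   = (uintptr_t)op + sizeof *op;

    return mprotect((void *)start, (size_t)(end - start), prot);
}

/* Set or clear the breakpoint flag on an otherwise read-only op:
 * make it writable, change the flag, make it read only again. */
static int set_breakpoint(demo_op *op, int on)
{
    if (protect_op(op, PROT_READ | PROT_WRITE) != 0)
        return -1;
    if (on)
        op->flags |= DEMO_SPECIAL;
    else
        op->flags &= ~DEMO_SPECIAL;
    return protect_op(op, PROT_READ);
}
```

The window during which the page is writable is kept as short as
possible, so any other stray write still faults.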

With these changes, building with -DPERL_DEBUG_READONLY_OPS (and
-Dusethreads, obviously) passes all tests.

I also investigated "microperl". "microperl", like "miniperl", is somewhat
of a misnomer. It's not that much smaller:

-rwxr-xr-x 1 nick nick 1091074 Aug 16 21:55 microperl
-rwxr-xr-x 1 nick nick 1223695 Aug 16 15:15 miniperl
-rwxr-xr-x 1 nick nick 1332163 Aug 16 16:03 perl

So what are the differences?

perl is (hopefully obviously) the thing that you want to install. It's
linked with the platform specific dynamic library loading code which
implements DynaLoader, and hence enables perl to load compiled XS code at
runtime. But that dynamic library loading code is written in XS, so needs
a copy of perl to build it. But the build system can't assume that there's
a copy of perl on the system to run this, so how does it bootstrap?

That's the job of miniperl. miniperl is a binary linked from (pretty much)
all the same object files that go to make up perl, but not DynaLoader.o.
It's good enough to run xsubpp, the XS to C translator (and the rest of
the build system), but none of the things it needs need perl to build them.
(Because we ship the small number of generated files that need perl to
be recreated, and now have a regression test to ensure that they're kept
up to date). So "perl" is pretty much "miniperl" + DynaLoader.

So where does microperl come in? It's not specifically intended to be
"tiny".  My understanding is that microperl was intended as an experiment as
to whether it's possible to build perl without needing to run some other
tool first to configure it. If you could, you might be able to replace
Configure with some sort of bootstrapping approach using a microperl to
build the configuration for the real perl. That sounds useful. But work on
it pretty much stopped over a decade ago.

Even the "no configuration" idea doesn't really work - you need at least one
canned configuration for ILP32 systems, and one for LP64 systems. (And,
possibly, a third for LLP64 systems, which may just be Win64)
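
To illustrate why one canned configuration can't cover all three, here is
a small sketch that classifies the data model of whatever platform it is
compiled on. The function name data_model is hypothetical, and the checks
assume 8-bit bytes; it only looks at the sizes of int, long and pointers,
which are the values such a canned configuration would have to hard-code.

```c
#include <assert.h>
#include <string.h>

/* Classify the compiling platform by the sizes of int, long and
 * pointers. Returns a string literal naming the data model. */
static const char *data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";             /* int, long, pointer all 32 bit */
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";              /* long and pointer 64 bit */
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 8)
        return "LLP64";             /* only long long and pointer 64 bit */
    return "other";
}
```

Each of these answers implies different values for things like the size
of long and of pointers, which is exactly what a configuration step would
normally probe for.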

Because microperl doesn't probe features, and builds off a canned
configuration, that configuration has to assume that pretty much everything
optional isn't present. Meaning that if one happens to take the microperl
config and graft
it into a regular build, add -DNO_MATHOMS to remove all the legacy support
wrappers, and bodge a couple of things that I can't configure away (yet), I
find that I can get regular perl pretty close to the size of microperl:

-rwxr-xr-x 1 nicholas p5p 1290000 Aug 17 15:17 microperl
-rwxr-xr-x 1 nicholas p5p 1293153 Aug 17 15:09 miniperl
-rwxr-xr-x 1 nicholas p5p 1387582 Aug 17 15:09 perl

microperl is not much different in size from miniperl.

(I don't know why perl is 94429 bytes bigger than miniperl, as DynaLoader.o
is only 9600 bytes, and the other 3 object files that differ between them
are only about 10K larger in total)

Which means that all the special-casing with -DPERL_MICRO and the various
special config files and Makefile *don't* actually gain anything meaningful
in size reduction.

As I also can't see anyone looking to replace Configure at this stage in
Perl 5's lifecycle, it's not clear to me that there's any actual case for it.

Given that we've managed to break microperl in two stable releases in the
past 3 years without anyone noticing until some time afterwards, and it
costs us time and effort to maintain it, I'm proposing that we announce in
5.18.0 that we're planning to eliminate it, and if no-one gives a good use
case as to why to keep it, we cull it before 5.20.0 ships.

And still on the subject of removing things that are no longer used, this
week I removed code relating to UTS. UTS was a mainframe version of System V
created by Amdahl, subsequently sold to UTS Global. The port has not been
touched since before 5.8.0, and UTS Global is now defunct.

 MANIFEST                      |    5 -
 Porting/ |    2 +-
 README.uts                    |  107 ----------------------
 ext/POSIX/hints/        |    9 --
 handy.h                       |    2 +-
 hints/                  |   32 -------
 perl.h                        |   23 +----
 plan9/mkfile                  |    2 +-
 pod/perl.pod                  |    1 -
 pod/perl58delta.pod           |    4 +-
 pod/perldelta.pod             |    8 +-
 util.c                        |    3 -
 uts/sprintf_wrap.c            |  196 -----------------------------------------
 uts/strtol_wrap.c             |  174 ------------------------------------
 win32/Makefile                |    5 +-
 win32/             |    5 +-
 x2p/a2p.h                     |    8 --
 17 files changed, 17 insertions(+), 569 deletions(-)

"Every little helps", as a certain supermarket round here likes to put it.*

Nicholas Clark

* And round quite a few places, as it's the world's third largest retailer.