
NWCLARK TPF grant report #88

From:
Nicholas Clark
Date:
July 26, 2013 13:42
Subject:
NWCLARK TPF grant report #88
Message ID:
20130726134241.GG4940@plum.flirble.org
[Hours]		[Activity]
2013/05/06	Monday
 0.50		RT #117031
 0.25		RT #117835
 9.00		pp_pack.c (hton* and htov*)
=====
 9.75

2013/05/07	Tuesday
 8.75		pp_pack.c (hton* and htov*)
=====
 8.75

2013/05/08	Wednesday
 0.25		Jenkins
 0.25		RT #114878
 0.25		RT #117893
 0.75		RT #117903
 0.50		RT #117907
 0.75		RT #78674 (and the hazards of PUTBACK)
 2.75		perl518delta
 0.50		process, scalability, mentoring
 3.00		reading/responding to list mail
=====
 9.00

2013/05/09	Thursday
 0.25		RT #116407
 0.25		RT #117501
 0.25		RT #117885
 0.75		RT #117893
 0.25		RT #117917
 0.25		SV_CHECK_THINKFIRST()
=====
 2.00

2013/05/10	Friday
 0.75		RT #117941
=====
 0.75

2013/05/11	Saturday
 1.00		APIs
 0.25		process, scalability, mentoring
 0.25		reading/responding to list mail
=====
 1.50

2013/05/12	Sunday
 2.00		Floating point stringification (RT #108378, RT #115800)
 0.25		RT #117969
=====
 2.25

Which I calculate is 34.00 hours

This week I simplified part of the implementation of pack and unpack,
removing about 130 lines of code, and reducing the object code size by about
2K. The only casualty was support for mixed-endian systems. Sorry, PDP-11
users.

In the medal stakes for "risk to sanity", the implementations of pack and
unpack are strong contestants, although *only* for bronze. (They're fighting
it out with the implementation of sprintf. Compared with sprintf, they have
the advantage that they've been seen as sufficiently dangerous to nearby
code that they've been quarantined into their own file. The regular
expression engine gets silver, and the reigning champion and gold medal
holder remains the parser itself.)

pack/unpack is one of those pieces of code which has grown organically as
features were added, and so has become too large for the big picture to be
easily comprehended. In particular, it's hard to look at it at more than one
level at once, and hence to see where it could be profitably refactored.

The main bodies of pp_pack and pp_unpack are large switch statements, which
dispatch on a single template code, and perform the correctly sized and
typed read or write of bytes to or from the string. There seems to be a
lot of repetition in there, but it's more at a "template" level than a code
level, because the types of the variables in each case differ. Pretty much
all the things that are the same size and could be merged have been
merged. There's no obvious way to squeeze it further with this design.
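To make the shape concrete, here is a minimal sketch of that kind of
dispatch. The function name and the selection of cases are mine for
illustration, not perl's actual pp_unpack code:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch only, not perl source: each case repeats
   the same copy-out pattern, but with a differently typed local
   variable, which is why the repetition is at the "template"
   level rather than being mergeable code. */
static long unpack_one(char code, const unsigned char *s)
{
    switch (code) {
    case 'c': { int8_t  v; memcpy(&v, s, sizeof v); return v; }
    case 's': { int16_t v; memcpy(&v, s, sizeof v); return v; }
    case 'l': { int32_t v; memcpy(&v, s, sizeof v); return v; }
    default:  return 0;
    }
}
```

The memcpy() rather than a pointer cast is what copes with the input
string being possibly unaligned.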

However, one thing did strike me as odd - a pattern of repeated macros in
the unpack code, particularly in all the cases that dealt with the various
sizes and signedness of integers. They exist due to the order in which the
code was written and then enhanced. The original code (dating from Perl 3 era)
read numbers as chunks of memory. It copied bytes to/from the (possibly
unaligned) input string to a local variable of the correct type. Much much
later Marcus Holland-Moritz added code to provide the pack modifiers "<" and
">" (for endianness) and implemented the endian swapping they require, by
endian swapping the values in the integer variables. Subsequently Ton Hospel
added code to cope with the unpack input string having been internally
transcoded to UTF-8, by having the copy transcode back if needed. In the
code, these two later additions appear as, respectively, the second and
first of the repeated macros.

So there was one macro to read a number into a variable, and then a second
macro to (optionally) spin the bits to re-order for a different endianness.
This looked more complex than it needed to be. On digging further I realised
that it was more subtle than it first looked - the code didn't just support
the obvious big and little endian byte orders, but also mixed endian.
(Technically, actually, I think it could support any arbitrary endianness.)
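The old two-step shape can be sketched like this (the names are mine; the
real code is a pair of macros, and handles all the integer sizes):

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the original two-macro pattern: step one copies the
   (possibly unaligned) bytes into a correctly typed variable;
   step two optionally reorders the value's bytes afterwards. */
static uint32_t read_u32(const unsigned char *s, int swap)
{
    uint32_t v;
    memcpy(&v, s, sizeof v);          /* the "read a number" macro */
    if (swap)                         /* the "spin the bits" macro */
        v = ((v & 0x000000FFu) << 24)
          | ((v & 0x0000FF00u) <<  8)
          | ((v & 0x00FF0000u) >>  8)
          | ((v & 0xFF000000u) >> 24);
    return v;
}
```

Note that a whole-value swap like this only expresses the big/little
reversal; covering arbitrary byte orders as well is part of what made the
real macros hairier.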

Big endian and little endian are familiar - on a big endian system the 32
bit number 0x04030201 would be stored in memory as the bytes 0x04, 0x03,
0x02, 0x01; on a little endian system as 0x01, 0x02, 0x03, 0x04. Mixed
endian is quirky - 0x03, 0x04, 0x01, 0x02.
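The three layouts can be demonstrated independently of the host's own byte
order by using shifts (a sketch; the function names are mine):

```c
#include <stdint.h>

/* Serialise a 32 bit value in each byte order, independent of
   the host's endianness (illustrative names, not perl's). */
static void to_big(uint32_t v, unsigned char out[4])
{
    out[0] = (v >> 24) & 0xFF; out[1] = (v >> 16) & 0xFF;
    out[2] = (v >>  8) & 0xFF; out[3] =  v        & 0xFF;
}
static void to_little(uint32_t v, unsigned char out[4])
{
    out[0] =  v        & 0xFF; out[1] = (v >>  8) & 0xFF;
    out[2] = (v >> 16) & 0xFF; out[3] = (v >> 24) & 0xFF;
}
/* PDP-11 "mixed" order: two little endian 16 bit words,
   most significant word first. */
static void to_mixed(uint32_t v, unsigned char out[4])
{
    out[0] = (v >> 16) & 0xFF; out[1] = (v >> 24) & 0xFF;
    out[2] =  v        & 0xFF; out[3] = (v >>  8) & 0xFF;
}
```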

We haven't had many bug reports from PDP-11 users since, hmmm, actually
since 5.000 was shipped, which made me wonder just how much of the code
still worked, and if anyone at all would notice**.

So I went digging. Mixed-endian support was first added in Perl 3, as
part of adding 'N' and 'n' templates to pack. Fallback functions were
provided in util.c in case the system didn't define htonl() and ntohl().

The fallback functions contain conditionally compiled special-case code for
little endian systems, and a fallback loop for other values of BYTEORDER.
The intent is that the fallback loop will work on any byteorder. In
fact, it only works correctly on 32 bit little endian systems, because it
*always* swaps the order of the bytes within the word. So the irony is that
the special-purpose little endian code is actually correct for all
platforms, and the intended-to-be-generic code only works on little endian.
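In outline (my reconstruction of the shape, not the verbatim util.c
source), the buggy "generic" loop amounts to this:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the old fallback's behaviour: whatever BYTEORDER is,
   it copies the value's bytes in reverse order.  Reversal happens
   to implement htonl() correctly on little endian; on big endian
   htonl() should be a no-op, so this is wrong there. */
static uint32_t fallback_htonl(uint32_t value)
{
    uint32_t result;
    const unsigned char *in = (const unsigned char *)&value;
    unsigned char *out = (unsigned char *)&result;
    int i;
    for (i = 0; i < 4; i++)
        out[i] = in[3 - i];   /* *always* swaps, unconditionally */
    return result;
}
```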

Hence I infer that they were never needed and hence never used, because
if they had been, regression tests would have failed. Moreover, it seems
that no-one ever even needed to link with them, because as is noted in "perl
4.0 patch 19" (commit 988174c19bcf26f6, Nov 1991):

+/*
+ * I think my_swap(), htonl() and ntohl() have never been used.
+ * perl.h contains last-chance references to my_swap(), my_htonl()
+ * and my_ntohl().  I presume these are the intended functions;
+ * but htonl() and ntohl() have the wrong names.  There are no
+ * functions my_htonl() and my_ntohl() defined anywhere.
+ * -DWS
+ */

(DWS appears to be "David W. Sanderson", who provided code for 'V' and 'v')


Because I wasn't confident about my code inspection, and it all looked too
unlikely to be actually true, I tested some of this on an emulator*. I don't
know how much a PDP-11 cost in its day, but it's amusing that you can
emulate it on a Raspberry Pi. (Which is a lot smaller, uses considerably
less power, and costs $25.)

The nice thing about Unix is that an ancient BSD is familiar enough to be
workable. The not so nice thing is that the C compiler misses a few things -
it only understands K&R argument style, doesn't do function prototypes, and
it's missing some *printf formats such as "%lX" and "%hx". (Which matter,
because int is 16 bits, and so there are a whole lot more size errors you
can't get away with for varargs functions such as printf.) There's no way
that that toolchain is going to build a perl from this decade, and I'm not
even sure if a viable C89 toolchain exists. So I didn't spend any more time
looking at it. (And it wouldn't have been trivial, as my "connectivity" with
the machine was typing and copy/paste on the logged in console. Getting
emulated networking set up would require root on the host machine figuring
out how to set up a bunch of IP-level games. This is a lot more complex than
running apt-get to install the emulator and then downloading the relevant
disk image for NetBSD.)


I made the executive decision that all the mixed endian support code can go.
Patches welcome to restore it, but we're only really interested if you have
a genuine reason to keep supporting PDP-11s going forward, and are in a
position to run a smoker.


With it gone, the simplification becomes tractable. Endianness is now either
"big endian" or "little endian", and the problem of "which endian?" reduces
to either "no-op" or "swap all the bytes". At which point, it's possible to
implement the endian-swap as part of the copy/transcode fixup, because the
number of bytes wanted is fixed and known, and the transformation is always
"reverse". So if endian swapping is needed, simply process those bytes from
the input in reverse order. The pair of macros (actually, an entire brace
of macros) simplifies massively, to an inline call to memcpy() if all is
fine, and a call to a fixup function if it is not. This removes a lot of
inline code related to endian swapping, and results in the 2K object code
shrinkage. The common use cases remain inline, so the code shouldn't be any
slower (although I haven't tried to measure this).
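The fixup idea can be sketched as a single copy routine (names mine; the
real code is still macros plus a fixup function):

```c
#include <string.h>

/* Sketch of the simplified scheme: a fixed-size copy that either
   preserves byte order (native) or reads the source backwards
   (other endianness).  One code path replaces the old copy macro
   plus swap macro pair. */
static void copy_maybe_reversed(void *dst, const void *src,
                                int len, int reverse)
{
    if (!reverse) {
        memcpy(dst, src, len);        /* the common case stays a plain copy */
    } else {
        unsigned char *d = (unsigned char *)dst;
        const unsigned char *s = (const unsigned char *)src + len;
        while (len--)
            *d++ = *--s;              /* process input bytes in reverse */
    }
}
```

Because big-to-little (and vice versa) is always "reverse the bytes", no
per-byte permutation table is needed once mixed endian is gone.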

Nicholas Clark

* If you weren't aware of it, GE in Canada need PDP-11 programmers for the
   next 27 years. That's more job security than most other firms offer.
   But assembler is the skill they care about, not Perl:
   http://www.vintage-computer.com/vcforum/showthread.php?37827-Greetings-from-GE-Canada

** Mixed endian demonstration, using the SimH emulator to emulate a PDP-11:

2.11 BSD UNIX (zeke) (console)

login: nick
Last login: Tue Aug 22 21:14:10 on console
2.11 BSD UNIX #14: Sun Oct 30 00:06:08 PDT 2005
Tue Aug 22 21:17:10 PDT 2006
nick[1] cat try.c
#include <stdio.h>
typedef long UV;
int main()
{
        int i;
        union {
                UV l;
                char c[4];
        } u;

        if (4 > 4)
                u.l = (((UV)0x08070605) << 32) | (UV)0x04030201;
        else
                u.l = (UV)0x04030201;
        i = 0;
        while (i < 4) {
        printf("%c", u.c[i]+'0');
        ++i;
}
        printf("\n");
        exit(0);
}
nick[2] ./try
3412
nick[3] uname -a
BSD zeke 2.11 2.11 BSD UNIX #14: Sun Oct 30 00:06:08 PDT 2005     root@zeke:/usr/src/sys/ZEKE  pdp11


