
[perl #117887] generic byteorder code in my_htonl and my_ntohl is incorrect

From: Nicholas Clark
Date: May 6, 2013 11:32
Subject: [perl #117887] generic byteorder code in my_htonl and my_ntohl is incorrect
Message ID: rt-3.6.HEAD-28177-1367839929-257.117887-75-0@perl.org
# New Ticket Created by  Nicholas Clark 
# Please include the string:  [perl #117887]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=117887 >


The fallback functions my_htonl and my_ntohl were added to util.c for
Perl 3.0, as part of adding 'N' and 'n' templates to pack. They contain
conditionally compiled special-case code for little endian systems, and
a fallback loop for other values of BYTEORDER. The intent is that the
fallback code will work on any byteorder. In fact, it only works
correctly on 32 bit little endian systems, because it *always* swaps the
order of the bytes within the word. The relevant code in util.c in blead
looks like this:

long
Perl_my_htonl(pTHX_ long l)
{
    union {
	long result;
	char c[sizeof(long)];
    } u;

#if BYTEORDER == 0x1234 || BYTEORDER == 0x12345678
#if BYTEORDER == 0x12345678
    u.result = 0; 
#endif 
    u.c[0] = (l >> 24) & 255;
    u.c[1] = (l >> 16) & 255;
    u.c[2] = (l >> 8) & 255;
    u.c[3] = l & 255;
    return u.result;
#else
#if ((BYTEORDER - 0x1111) & 0x444) || !(BYTEORDER & 0xf)
    Perl_croak(aTHX_ "Unknown BYTEORDER\n");
#else
    I32 o;
    I32 s;

    for (o = BYTEORDER - 0x1111, s = 0; s < (sizeof(long)*8); o >>= 4, s += 8) {
	u.c[o & 0xf] = (l >> s) & 255;
    }
    return u.result;
#endif
#endif
}
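
To see why the loop always swaps, it can be traced with BYTEORDER passed as
a run-time argument instead of a compile-time constant. The following is a
minimal sketch, not code from util.c: loop_store is a hypothetical helper,
and it models only a 4 byte long.

```c
/* Illustrative only. In perl, BYTEORDER is a compile-time constant; here
 * it is a parameter so that any layout can be traced on any host. */
static int
loop_store(unsigned long byteorder, unsigned long l, unsigned char c[4])
{
    unsigned long o;
    unsigned int s;

    /* Same sanity check that guards the loop in util.c. */
    if (((byteorder - 0x1111) & 0x444) || !(byteorder & 0xf))
        return -1;

    /* Same walk as the fallback loop: one hex digit of BYTEORDER per byte,
     * starting from the low digit and the low byte of l. */
    for (o = byteorder - 0x1111, s = 0; s < 32; o >>= 4, s += 8)
        c[o & 0xf] = (unsigned char)((l >> s) & 255);
    return 0;
}
```

For l = 0x0A0B0C0D the bytes land, in address order, as 0a 0b 0c 0d for
0x1234, as 0d 0c 0b 0a for 0x4321, and as 0c 0d 0a 0b for 0x3412. Only the
first of these is network order.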


Extracted into a standalone file (byteorder.c), with the first #if changed
to permit the code to be forced to use either implementation, and with
output generated by:

int
main() {
    unsigned long in = 0x0A0B0C0D;
    unsigned long mid = Perl_my_htonl(in);
    printf("my: %08lx %08lx %08lx\n", in, mid, (unsigned long) Perl_my_ntohl(mid));
    mid = htonl(in);
    printf("    %08lx %08lx %08lx\n", in, mid, ntohl(mid));
    return 0;
}


Tested on 32 bit little endian systems, first with the special-case code,
then forced to use the loop:

$ ./byteorder-special32 
my: 0a0b0c0d 0d0c0b0a 0a0b0c0d
    0a0b0c0d 0d0c0b0a 0a0b0c0d
$ ./byteorder32 
my: 0a0b0c0d 0d0c0b0a 0a0b0c0d
    0a0b0c0d 0d0c0b0a 0a0b0c0d

However, the same code on a 32 bit big endian system shows the bug - the
supposedly generic loop code actually reverses the word, whereas the
"special" code for little endian systems works:

$ ./byteorder32 
my: 0a0b0c0d 0d0c0b0a 0a0b0c0d
    0a0b0c0d 0a0b0c0d 0a0b0c0d
$ ./byteorder-special32 
my: 0a0b0c0d 0a0b0c0d 0d0c0b0a
    0a0b0c0d 0a0b0c0d 0a0b0c0d


And again on a mixed endian system (emulated PDP-11 running 2.11BSD*):

nick[63] ./byteorder
my: 0a0b0c0d 0d0c0b0a 0a0b0c0d
    0a0b0c0d 0b0a0d0c 0a0b0c0d
nick[64] ./byteorder-special 
my: 0a0b0c0d 0b0a0d0c 0c0d0a0b
    0a0b0c0d 0b0a0d0c 0a0b0c0d
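
The middle column of the system htonl rows can be cross-checked by hand.
What follows is a minimal sketch, assuming the Configure convention that the
hex digits of BYTEORDER, read left to right, give the significance of the
byte at each successive address (1 = least significant). native_value is a
hypothetical helper that reports how a host with a given BYTEORDER reads
four bytes of memory as a long.

```c
/* Hypothetical helper, not part of perl. Assumes a 4-digit BYTEORDER whose
 * digits are a permutation of 1..4. */
static unsigned long
native_value(unsigned long byteorder, const unsigned char c[4])
{
    unsigned long v = 0;
    int a;

    for (a = 0; a < 4; a++) {
        /* Hex digit for address a, reading BYTEORDER left to right. */
        unsigned int digit =
            (unsigned int)((byteorder >> (4 * (3 - a))) & 0xf);
        /* Digit 1 is the least significant byte, digit 4 the most. */
        v |= (unsigned long)c[a] << (8 * (digit - 1));
    }
    return v;
}
```

Fed the network-order bytes 0a 0b 0c 0d, it gives 0d0c0b0a for 0x1234,
0a0b0c0d for 0x4321, and 0b0a0d0c for 0x3412, which are the values the
system htonl printed in each of the three runs.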


From this I infer that the fallback functions were never needed on mixed
endian architectures such as PDP-11s, as the operating system always
supplied a correct htonl etc.


So the irony is that the special-purpose little endian code is actually
correct for all platforms (attached as byteorder-fixed.c).
An even better fix would be to use the generator macros BETOH and HTOBE
instead (attached as byteorder-macros.c), i.e.:

BETOH(my_ntohl,long)
BETOH(my_ntohs,short)
HTOBE(my_htonl,long)
HTOBE(my_htons,short)
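
The reason shift-and-store works everywhere is that shifts operate on the
value, not on its layout in memory, so placing the most significant byte at
the lowest address yields network order on any host. The following is a
minimal sketch of that idea only. It is not the expansion of BETOH or HTOBE,
the names sketch_htonl and sketch_ntohl are hypothetical, and it uses
uint32_t rather than long to keep the width fixed.

```c
#include <stdint.h>
#include <string.h>

/* Host to network: write the bytes most significant first, then read the
 * four bytes back as a native 32 bit integer. */
static uint32_t
sketch_htonl(uint32_t h)
{
    unsigned char c[4];
    uint32_t n;

    c[0] = (unsigned char)((h >> 24) & 255);
    c[1] = (unsigned char)((h >> 16) & 255);
    c[2] = (unsigned char)((h >> 8) & 255);
    c[3] = (unsigned char)(h & 255);
    memcpy(&n, c, sizeof n);
    return n;
}

/* Network to host: read the bytes in address order and rebuild the value
 * with shifts, so the host layout never enters into it. */
static uint32_t
sketch_ntohl(uint32_t n)
{
    unsigned char c[4];

    memcpy(c, &n, sizeof n);
    return ((uint32_t)c[0] << 24) | ((uint32_t)c[1] << 16)
         | ((uint32_t)c[2] << 8) | (uint32_t)c[3];
}
```

Neither function contains a byte order test, which is why no #if on
BYTEORDER is needed around them.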


I think that the best fix, however, is

a) to explicitly drop support for mixed-endian platforms (and remove all
   remaining code that supports them)**
b) then post v5.18.0 merge in the fixes in smoke-me/nicholas/genpacksizetables
   which eliminate the need for the support functions completely.

Nicholas Clark

*  PDP-11 emulator is part of simh: http://simh.trailing-edge.com/
   2.11BSD from http://www.ak6dn.dyndns.org/PDP-11/2.11BSD/

   Trivially easy to install on macports as 'simh' or Debian as 'simh'.
   Currently my Raspberry Pi is emulating a PDP-11. (Running the emulator on
   my laptop made the fan spin, which was annoying)

   No, I don't propose that anyone spends time trying to get Perl working on
   PDP-11. For starters, you'll need to find an ANSI C toolchain. The BSD
   cc is resolutely K&R, doesn't do function prototypes, or "%lX" or "%hx"
   format strings in printf.

** Unless someone is willing to fix blead so that it compiles again on such
   a platform, and then run a reasonably regular smoker to ensure that it
   stays compiling.

