develooper Front page | perl.perl5.porters | Postings from October 2013

NWCLARK TPF grant report #106

From:
Nicholas Clark
Date:
October 3, 2013 13:53
Subject:
NWCLARK TPF grant report #106
Message ID:
20131003135257.GN4940@plum.flirble.org
[Hours]		[Activity]
2013/09/09	Monday
 5.75		HAS_QUAD
 0.25		sort.pm RT #88546, RT #88547
=====
 6.00

2013/09/10	Tuesday
 4.75		HAS_QUAD
 0.75		RT #114494 (PERL_MICRO)
 1.25		The Unicode bug
 0.25		smoke-me branches
 0.50		t/re/reg_mesg.t
=====
 7.50

2013/09/11	Wednesday
 0.75		BmUSEFUL
 3.75		HAS_QUAD
=====
 4.50

Which I calculate is 18.00 hours

Tony's work on RT #115928 revealed something rather strange. The core has
a macro HAS_QUAD, described thus:

    /* HAS_QUAD:
     *      This symbol, if defined, tells that there's a 64-bit integer type,
     *      Quad_t, and its unsigned counterpart, Uquad_t. QUADKIND will be one
     *      of QUAD_IS_INT, QUAD_IS_LONG, QUAD_IS_LONG_LONG, QUAD_IS_INT64_T,
     *      or QUAD_IS___INT64.
     */

However, it turns out that reality doesn't match this description. Whilst
HAS_QUAD (previously QUAD) dates from mid perl-4 times, proper detection of
it via Configure dates from this commit:

    commit de1c2614758a00c0d93fc45414417a54cdf923b3
    Author: Jarkko Hietaniemi <jhi@iki.fi>
    Date:   Sat Oct 30 12:41:50 1999 +0000
    
        Add HAS_QUAD ($Config{d_quad}); use it.
    
        p4raw-id: //depot/cfgperl@4497


However, in addition to all the logic to set macros appropriately, that
commit also adds the "#undef HAS_QUAD" in this:

which adds the #undef into

    #  if IVSIZE == 8
    #    define IV_IS_QUAD
    #    define UV_IS_QUAD
    #    ifndef HAS_QUAD
    #      define HAS_QUAD
    #    endif
    #  else
    #    undef IV_IS_QUAD
    #    undef UV_IS_QUAD
    #    undef HAS_QUAD
    #  endif


which has the effect of pretending that there isn't actually a "Quad" type
on the platform if you're building with 32 bit IVs. (Which is the default on
a platform with 32 bit longs and pointers.) So that one undef undoes the
good portability work in precisely the place where it would be most useful -
platforms where a 64 bit type is available, but the built perl isn't using
it.

It feels like it's wrong as-is, in that we ought to leave HAS_QUAD defined,
so that XS code can take advantage of 64 bit types if present. As ever, life
is not that simple.

If you simply remove that line on a 32 bit platform built with
-Duse64bitint, the build fails. utf8.h makes the choice between 7 or 13 byte
extended sequences depending on HAS_QUAD, but the "HAS_QUAD" code then
assumes a different macro is defined, which is only defined for 64 bit UVs.

Fix that and the build completes, but tests for sprintf and pack fail. The
printf tests fail because printf's parsing of "long long" formats is
conditional on 64 bit support:

$ perl -MConfig -e 'printf "%d %lld\n", $Config{ivsize}, $Config{uvsize}'
8 8
$ ./perl -Ilib -MConfig -e 'printf "%d %lld\n", $Config{ivsize}, $Config{uvsize\
}'
4 %lld

(which I think is pretty ugly in the first place), and *that* is expressed
(badly) in terms of HAS_QUAD, which is being used as a proxy for "are IVs
64 bits".

The pack failures are more frustrating still. The 'q' and 'Q' pack formats
are fatal errors on a perl with 32 bit IVs:

$ perl -le 'print length pack "q", ~0'
8
$ ./perl -le 'print length pack "q", ~0'
Invalid type 'q' in pack at -e line 1.

but the C code in pp_pack.c (again) is enabled with HAS_QUAD, not a test of
IVSIZE. The code itself is written on the assumption that it will be enabled
with 32 bit IVs, because it falls back to NVs. This works:

#ifdef HAS_QUAD
        case 'q':
            while (len-- > 0) {
                Quad_t aquad;
                SHIFT_VAR(utf8, s, strend, aquad, datumtype, needs_swap);
                if (!checksum)
                    mPUSHs(aquad >= IV_MIN && aquad <= IV_MAX ?
                           newSViv((IV)aquad) : newSVnv((NV)aquad));
                else if (checksum > bits_in_uv)
                    cdouble += (NV)aquad;
                else
                    cuv += aquad;
            }
            break;


The problem is that the pack tests were never written to cope with this, and
fail. They could be fixed. More troubling though, is that t/op/64bitint.t
also fails, because it starts like this:

    eval { my $q = pack "q", 0 };
    skip_all('no 64-bit types') if $@;


It (sadly) seems to be a moderately common idiom on CPAN to determine whether
IVs are 64 bit by using pack 'q' rather than the (more) correct approach of
asking %Config. (Heck, or just comparing ~0 with 0xFFFFFFFF)

http://grep.cpan.me/?q=eval.%2Bpack%5B%5EA-Za-z0-9%5D%2B%5BqQ%5D

(not the best search term, but gets rid of most the false positives. Looks
like at least 10 modules use this approach, and would break)

So I think that 'q' and 'Q' support has to stay coupled to IV size. :-(


With all of those fixed, it's possible to remove that #undef, keep HAS_QUAD
defined, and all the core the test pass. As to CPAN...
 
There are around 20 distributions using HAS_QUAD in ways that might effect
their code:
 
http://grep.cpan.me/?q=HAS_QUAD%5Cb+-file%3Appport.h+-file%3Aconfig_H.gc
 
Most build and pass tests on an unmodified perl. Of those that do, two fail
if HAS_QUAD is enabled, both from the same author. They seem to be carefully
written, with the intent of actually making use of the 64 bit type if
present. One makes the same mistake as the perl core repeats, using #ifdef
HAS_QUAD when it should use #if UVSIZE >= 8.

The other is more involved - it defines a type WTYPE, which is intended to
be the widest type available. Unfortunately, if HAS_QUAD changes to being
defined on an installation with 32 bit UVs, it fails to compile due to a lot
of prototype mismatches, because many functions are declared in the header
as returning UV, but the definition uses WTYPE. Fix all of those, and it
builds, but the tests fail, due (much like t/op/pack.t) to assuming that
Perl-space integer maths can be done to the same precision as C maths.


With those two modules patched, all CPAN modules that build on (unmodified)
blead also build with the change. However, we didn't feel that it was worth
actually making the change, and removing the undef. It would have been 10
years ago, but these days, pretty much everything which wants 64 bit integer
support is simply using long long:
 
http://grep.cpan.me/?q=long+long%5B%5E%27%5D%5B%5E%5Cn%5D*%3B+-file%3Appport.h 
 
(233 distributions found, but a few are false positives) 


So in the end, I merged the fixes, but did not enable HAS_QUAD.

(Which turned out to be useful pretty soon after, as we found it be
necessary to enable HAS_QUAD while building the core C code, to avoid rand()
returning garbage on HP-UX, which, of all things, was triggered when Tony's
work on RT #115928 was merged back.)

Nicholas Clark



nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About