Data::Dumper and large integers

Aaron Crane
December 30, 2015 15:14

TL;DR: I want to change Data::Dumper so that there are more cases in
which it avoids wrapping quotes around integers.

Consider this simple program:

use Config;
use Data::Dumper;
print for "$Config{ivsize}\n",
    Dumper(2_000_000_000, 4_000_000_000, 5_000_000_000);

On a 32-bit system it produces this output:

$VAR1 = 2000000000;
$VAR2 = 4000000000;
$VAR3 = '5000000000';

This isn't terribly surprising: $VAR1 is IOK, $VAR2 is UOK, and $VAR3
is NOK; and Data::Dumper always uses quotes to emit an NOK-only
scalar, but IOK and UOK scalars can be emitted as plain integers.
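
The flags mentioned above can be inspected from Perl itself. What follows is a minimal sketch using the core B module; the helper name numeric_flags is made up for illustration, and it only distinguishes IOK from NOK (it does not single out the UOK case). The result for the two larger values depends on the integer size of the perl running it.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report which public numeric flags a scalar carries.
sub numeric_flags {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'NOK' if $flags & B::SVf_NOK();
    return join ',', @set;
}

# Output for the larger values varies with the platform's integer size.
printf "%-13s %s\n", $_, numeric_flags($_)
    for 2_000_000_000, 4_000_000_000, 5_000_000_000;
```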

There is a surprise to come on a 64-bit system, though:

$VAR1 = 2000000000;
$VAR2 = 4000000000;
$VAR3 = 5000000000;

That's because all these numbers fit into an IV on such a platform, so
the values are all IOK, and so $VAR3 gets no quotes.

However, even a 64-bit platform uses quotes for an integer that needs
eleven digits:

$ perl -MData::Dumper -MConfig -e \
  'print for "$Config{ivsize}\n", Dumper(10_000_000_000)'
$VAR1 = '10000000000';

That's caused by a specific imposition of a ten-digit limit in the XS
implementation. (Well, a ten-character limit, strictly: the minus sign
for a negative number counts against the limit too, so all integers
-1e9 and below get quoted.)
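
The length rule on its own can be modelled in a few lines. This is only a sketch of the ten-character limit as described above, not the XS code; the helper name over_ten_chars is hypothetical, and a value that passes this check may still be quoted for other reasons (for example, being NOK-only on a 32-bit perl).

```perl
use strict;
use warnings;

# Hypothetical helper modelling only the length rule: an integer whose
# decimal text (including any minus sign) exceeds ten characters is quoted.
sub over_ten_chars {
    my ($n) = @_;
    return length("$n") > 10 ? 1 : 0;
}

for my $n (9_999_999_999, 10_000_000_000, -999_999_999, -1_000_000_000) {
    printf "%14s => %s\n", $n,
        over_ten_chars($n) ? 'quoted' : 'within the limit';
}
```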

Also, the pure-Perl implementation quotes all these numbers on all
platforms. Specifically, it uses quotes for all integers outside the
range -999_999_999 .. 999_999_999 (by matching against a suitable
regex).
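
A pattern in that spirit might look like the following sketch. The pattern and the helper name pp_style_unquoted are illustrative assumptions that accept exactly the range described above; they are not a quotation of the module's source.

```perl
use strict;
use warnings;

# Illustrative pattern: zero, or an optional minus sign followed by one to
# nine digits with no leading zero, i.e. -999_999_999 .. 999_999_999.
my $small_integer = qr/^(?:0|-?[1-9]\d{0,8})\z/;

# Hypothetical helper: true if the text would be emitted without quotes
# under a rule of this shape.
sub pp_style_unquoted {
    my ($text) = @_;
    return $text =~ $small_integer ? 1 : 0;
}

print "$_: ", (pp_style_unquoted($_) ? 'plain' : 'quoted'), "\n"
    for qw(0 999999999 1000000000 -999999999 -1000000000 007);
```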

As far as I can tell, these differences (both between 32-bit and
64-bit systems, and XS and pure-Perl implementations) aren't entirely
deliberate. One comment in Dumper.xs points out that "the pure perl
and XS non-qq outputs have historically been different". Another
comment, on the code that applies the ten-digit limit, says "Looks
like we're on a 64 bit system. Make it a string so that if a 32 bit
system reads the number it will cope better." The patch that
introduced that comment is here:

which also expands on the issue it's seeking to fix:

> On 64 bit perls XS code would dump very large integers as numbers.
> If fed to 32 bit perls these will immediately be treated as floating
> point, which will cause digits to be lost. Now they are dumped as strings,
> which will preserve digits in a 32 bit perl that uses them as a string.

But there are still many values that would be happily emitted without
quotes on a 64-bit system, but would be treated as floats on a 32-bit
system: every integer in the half-open range [2**32 .. 10e9). So
Data::Dumper definitely doesn't have the property that output
generated on a 64-bit Perl can be losslessly evaluated by a 32-bit
Perl (and that's been true for well over a decade).
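
To make that gap concrete, here is a small sketch that tests membership in the range (10e9 being the same number as 1e10). The helper name in_gap is hypothetical; it encodes only the two bounds given above.

```perl
use strict;
use warnings;

# Hypothetical helper: true for integers that are at most ten digits long
# (so within the ten-digit limit) yet too large for a 32-bit UV.
sub in_gap {
    my ($n) = @_;
    return ($n >= 2**32 && $n < 1e10) ? 1 : 0;
}

printf "%11s %s\n", $_, in_gap($_) ? 'in the gap' : 'outside the gap'
    for 4_000_000_000, 4_294_967_296, 5_000_000_000, 10_000_000_000;
```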

I also think it's unlikely that downstream users have tight coupling
on the precise output DD generates for large integers, given the
existing differences caused by both integer size and implementation.

I therefore propose to change Data::Dumper to emit all integers[*] in
the range IV_MIN .. UV_MAX without quotes, in both the PP and XS
implementations, even though the values of IV_MIN and UV_MAX are
platform- and configuration-dependent.
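
As a rough illustration of the proposed rule, the sketch below decides whether a piece of decimal text names an integer in IV_MIN .. UV_MAX for the perl running it. It is an assumption about how such a check could be written in pure Perl, not the proposed patch; the helper name fits_native_integer is hypothetical. It compares digit strings rather than numbers, so that out-of-range input is never rounded to a float first.

```perl
use strict;
use warnings;

my $UV_MAX = ~0;                 # largest unsigned integer on this perl
my $IV_MIN = -(~0 >> 1) - 1;     # most negative signed integer on this perl

# Hypothetical helper: true if $text is a canonical decimal integer within
# IV_MIN .. UV_MAX.
sub fits_native_integer {
    my ($text) = @_;
    return 0 unless $text =~ /^(?:0|(-?)([1-9]\d*))\z/;
    my ($sign, $digits) = ($1 // '', $2 // '0');
    # Negative values are bounded by IV_MIN, non-negative ones by UV_MAX.
    my $limit = $sign ? substr("$IV_MIN", 1) : "$UV_MAX";
    return 1 if length($digits) < length($limit);
    return 0 if length($digits) > length($limit);
    return $digits le $limit ? 1 : 0;   # same length: string order is numeric order
}

print "$_: ", (fits_native_integer($_) ? 'plain' : 'quoted'), "\n"
    for '2000000000', "$UV_MAX", "${UV_MAX}0", "$IV_MIN", '007';
```

Because the two limits are computed at run time, the same code gives different answers on perls with different integer sizes, which is the platform dependence the proposal accepts.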

Any objections?

[*] Well, almost. When DD acquired an XS implementation of $Useqq,
some minor output changes did cause some BBC failures, and therefore
those differences were stamped out:

So I propose to leave the output under $Useqq unchanged.

