
[perl #72784] [META] misuse of I32

Father Chrysostomos via RT
July 28, 2013 02:33
On Sat Feb 13 07:51:05 2010, nicholas wrote:
> There's a lot of inappropriate use of the I32 type throughout the core.
> Likely most should be something else, one of U32, STRLEN, SSize_t, IV
> or UV.
> The misuse of I32 causes lots of bugs (panics, SEGVs, silent data
> errors) if strings go over 2GB.
> This is a meta-ticket for collating tickets relating to these sorts of

What should the maximum string and array lengths be?

In various places in the perl source, we have overflow checks that use
different sizes; in other places things just overflow, but at different

Take sv_setpvn for example.  If the length is greater than IV_MAX it
croaks.  safesysmalloc croaks if the length is greater than the maximum
SSize_t can hold, but only on debugging builds.  malloc itself takes a
size_t as its argument.
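
To make the mismatch concrete, here is a minimal sketch of the two checks side by side. It is not the code in sv.c or util.c; the SKETCH_* macros and sketch_* function names are made up for illustration, and the type choices (a 64-bit IV, an SSize_t as wide as ptrdiff_t) are assumptions about the build.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for this sketch only; perl's real IV and SSize_t are chosen
 * by Configure and may differ from these assumptions. */
#define SKETCH_IV_MAX     INT64_MAX    /* assumes a 64-bit IV */
#define SKETCH_SSIZE_MAX  PTRDIFF_MAX  /* assumes SSize_t is as wide as ptrdiff_t */

/* Like the sv_setpvn check described above: refuse lengths above IV_MAX. */
static bool sketch_setpvn_len_ok(size_t len)
{
    return (uintmax_t)len <= (uintmax_t)SKETCH_IV_MAX;
}

/* Like the debugging-build safesysmalloc check: refuse sizes that do not
 * fit in the signed counterpart of size_t. */
static bool sketch_debug_malloc_size_ok(size_t size)
{
    return size <= (size_t)SKETCH_SSIZE_MAX;
}
```

Where size_t is 32 bits wide and IV is 64 bits wide, the first function accepts every size_t value while the second rejects anything above 2**31-1, so the same length can pass one check and fail the other.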

So that means 2**63-1 is the longest string supported on 64-bit
platforms via sv_setpvn.  If you use sv_grow and write to the string
buffer directly, you can go beyond that, if the malloc implementation
lets you, except under debugging builds, where the limit is still 2**63-1.

On 32-bit platforms, if -Duse64bitint is not used, the situation is the
same, but with 2**31-1 for the maximum.

On 32-bit platforms when -Duse64bitint *is* used, sv_setpvn allows
values all the way up to 2**32-1, but sv_grow will croak on anything
above 2**31-1 under debugging builds.  On non-debugging builds both
methods (sv_setpvn and sv_grow+direct write to PVX) allow strings up to

So nothing is consistent.
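
The figures quoted above can be reproduced with a little arithmetic. This is only a sketch under my reading of the checks: the sv_setpvn limit is taken to be the smaller of what an unsigned length of the platform's size can hold and IV_MAX, and the debugging-build limit is the maximum of a signed type of that size. The function names are hypothetical.

```c
#include <stdint.h>

/* Largest value of an unsigned integer that is `bits` wide (1..64). */
static uint64_t sketch_max_unsigned(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* Largest value of a signed integer that is `bits` wide (2..64). */
static uint64_t sketch_max_signed(unsigned bits)
{
    return sketch_max_unsigned(bits - 1);
}

/* Longest length that passes an "is it above IV_MAX?" check, given the
 * width of the length type and the width of IV. */
static uint64_t sketch_setpvn_limit(unsigned size_bits, unsigned iv_bits)
{
    uint64_t by_type  = sketch_max_unsigned(size_bits);
    uint64_t by_check = sketch_max_signed(iv_bits);
    return by_type < by_check ? by_type : by_check;
}

/* Largest size that passes a "does it fit in SSize_t?" check. */
static uint64_t sketch_debug_malloc_limit(unsigned size_bits)
{
    return sketch_max_signed(size_bits);
}
```

With (size_bits, iv_bits) set to (64, 64), (32, 32) and (32, 64), sketch_setpvn_limit gives 2**63-1, 2**31-1 and 2**32-1 respectively, while sketch_debug_malloc_limit depends only on size_bits and gives 2**63-1 or 2**31-1.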

For all practical purposes, one is limited to 31-bit string lengths on
32-bit platforms, because, unless you really know perl’s internals well,
your 3GB string *will* be copied, and you will immediately run out of memory.

I suggest we make 2**(PTRSIZE-1)-1 (*), counting PTRSIZE in bits, the maximum string length and
make everything consistent with that.  That is already the maximum array

Regular expressions are currently limited to I32_MAX.  Changing that
would break the regular expression plugin interface.  I don’t think it’s
unreasonable to keep it at its current limit.  Note I am talking about
regular expressions themselves, not the string they match against.

In 6174b39a8 I allowed pos() to record positions up to 2**PTRSIZE-2.  I
think that was a mistake, but harmless, as pos() values are always
truncated to the length of the string.

* I.e., SSize_t_MAX, but we have no such macro currently.
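
One way such a macro could be written, as a sketch only: the name SKETCH_SSIZE_T_MAX and the sketch_SSize_t typedef are placeholders, and it assumes the signed type is exactly as wide as size_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder for perl's SSize_t; assumed here to match ptrdiff_t. */
typedef ptrdiff_t sketch_SSize_t;

/* Set every bit of size_t, then shift right by one: that clears the top
 * bit and leaves the largest positive value of a signed type of the
 * same width. */
#define SKETCH_SSIZE_T_MAX ((sketch_SSize_t)(~(size_t)0 >> 1))
```

Deriving the value from size_t rather than hard-coding it means the same definition works for any pointer size.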


Father Chrysostomos

via perlbug:  queue: perl5 status: open
