[perl #72784] [META] misuse of I32
From: Father Chrysostomos via RT
Date: July 28, 2013 02:33
Subject: [perl #72784] [META] misuse of I32
Message ID: rt-3.6.HEAD-2552-1374978808-944.72784-15-0@perl.org

On Sat Feb 13 07:51:05 2010, nicholas wrote:
> There's a lot of inappropriate use of the I32 type throughout the core.
> Likely most should be something else, one of U32, STRLEN, SSize_t, IV
> or UV.
>
> The misuse of I32 causes lots of bugs (panics, SEGVs, silent data
> errors) if strings go over 2GB.
>
> This is a meta-ticket for collating tickets relating to these sorts of
> bugs.

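The failure mode described in the quoted text can be sketched in a few lines of C. This is only an illustration: the helper names are hypothetical and do not exist in the perl source, and a fixed-width uint64_t stands in for whatever length type a given call site uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative helpers only; these names do not exist in the perl source. */

/* True if a byte length can be stored in a signed 32-bit count (I32-like). */
bool sketch_fits_i32(uint64_t len)
{
    return len <= (uint64_t)INT32_MAX;
}

/* The low 32 bits that survive when a 64-bit length is squeezed into a
 * 32-bit slot. An unsigned conversion is used so that the wraparound is
 * well defined in C. */
uint32_t sketch_low32(uint64_t len)
{
    return (uint32_t)len;
}
```

Under these assumptions a length of 2**31 no longer fits the signed 32-bit count, and a length of 2**32+5 keeps only the value 5 in its low 32 bits, which is how a silent data error can arise rather than a crash.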
What should the maximum string and array lengths be?

In various places in the perl source, we have overflow checks that use
different sizes; in other places things just overflow, but at different
thresholds.

Take sv_setpvn for example. If the length is greater than IV_MAX it
croaks. safesysmalloc croaks if the length is greater than the maximum
SSize_t can hold, but only on debugging builds. Malloc takes a size_t
as its argument.

So that means 2**63-1 is the longest string supported on 64-bit
platforms via sv_setpvn. If you use sv_grow and write to the string
buffer directly, you can go beyond that, if the malloc implementation
lets you, except under debugging builds, where the limit is still 2**63-1.

On 32-bit platforms, if -Duse64bitint is not used, the situation is the
same, but with 2**31-1 for the maximum.

On 32-bit platforms when -Duse64bitint *is* used, sv_setpvn allows
values all the way up to 2**32-1, but sv_grow will croak on anything
above 2**31-1 under debugging builds. On non-debugging builds both
methods (sv_setpvn and sv_grow+direct write to PVX) allow strings up to
2**32-1.

So nothing is consistent.

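The cases above can be tabulated with a small model. This is a sketch of the checks as this message describes them, not code from the perl source: the struct, its fields, and the function names are all hypothetical, and the model assumes that only three limits matter (IV_MAX in sv_setpvn, the SSize_t maximum in safesysmalloc on debugging builds, and the size_t range of malloc).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a build; not perl's real configuration. */
struct sketch_build {
    unsigned ptr_bits;  /* width of size_t and SSize_t: 32 or 64 */
    unsigned iv_bits;   /* width of IV: 64 on 64-bit or with -Duse64bitint */
    bool debugging;     /* debugging build */
};

/* Largest value of a signed type of the given width: 2**(bits-1)-1. */
static uint64_t sketch_signed_max(unsigned bits)
{
    return (UINT64_C(1) << (bits - 1)) - 1;
}

/* Largest value of an unsigned type of the given width: 2**bits-1. */
static uint64_t sketch_unsigned_max(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

static uint64_t sketch_min(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Longest string reachable by sv_grow plus a direct write to the buffer:
 * bounded by malloc's size_t, and by the SSize_t check on debugging
 * builds. */
uint64_t sketch_max_len_grow(const struct sketch_build *b)
{
    uint64_t lim = sketch_unsigned_max(b->ptr_bits);
    if (b->debugging)
        lim = sketch_min(lim, sketch_signed_max(b->ptr_bits));
    return lim;
}

/* Longest string reachable via sv_setpvn: the same allocation limits,
 * plus the croak above IV_MAX. */
uint64_t sketch_max_len_setpvn(const struct sketch_build *b)
{
    return sketch_min(sketch_signed_max(b->iv_bits), sketch_max_len_grow(b));
}
```

Feeding in the three configurations discussed (64-bit, 32-bit without -Duse64bitint, 32-bit with it), each with and without debugging, reproduces the thresholds listed above and makes the lack of a single limit easy to see.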
For all practical purposes, one is limited to 31-bit string lengths on
32-bit platforms, because, unless you really know perl’s internals well,
your 3GB string *will* be copied, and you will immediately run out of
memory.

I suggest we make 2**(PTRSIZE-1)-1 (*) the maximum string length and
make everything consistent with that. That is already the maximum array
length.

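A single consistent check along the lines suggested here might look like the following. It is a minimal sketch, assuming SSize_t is the signed counterpart of size_t; the macro and function names are hypothetical, since no SSize_t_MAX macro exists at the time of writing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the missing SSize_t_MAX macro. Assumes
 * SSize_t has the same width as size_t, so its maximum is SIZE_MAX with
 * the top bit cleared. */
#define SKETCH_SSIZE_T_MAX ((size_t)(SIZE_MAX >> 1))

/* One limit for every code path: a length is acceptable only if it fits
 * in the signed size type. */
bool sketch_len_ok(size_t len)
{
    return len <= SKETCH_SSIZE_T_MAX;
}
```

If every allocation and length-setting path called the same predicate, debugging and non-debugging builds would agree on where the limit lies.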
Regular expressions are currently limited to I32_MAX. Changing that
would break the regular expression plugin interface. I don’t think it’s
unreasonable to keep it at its current limit. Note I am talking about
regular expressions themselves, not the string they match against.

In 6174b39a8 I allowed pos() to record positions up to 2**PTRSIZE-2. I
think that was a mistake, but harmless, as pos() values are always
truncated to the length of the string.

* I.e., SSize_t_MAX (reading PTRSIZE as the pointer width in bits), but
we have no such macro currently.
--
Father Chrysostomos
---
via perlbug: queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=72784