
Re: RFC: Use ptrdiff_t instead of SSize_t

From:
Karl Williamson
Date:
February 16, 2019 19:18
Subject:
Re: RFC: Use ptrdiff_t instead of SSize_t
Message ID:
d3e4a894-35e5-12fd-9121-d2ed9579d6e4@khwilliamson.com
On 8/3/18 11:30 PM, Karl Williamson wrote:
> On 08/03/2018 04:48 PM, Tomasz Konojacki wrote:
>>
>> On Tue, 17 Jul 2018 14:29:49 +1000
>> Tony Cook <tony@develop-help.com> wrote:
>>
>>> On Mon, Jul 16, 2018 at 06:12:25PM -0600, Karl Williamson wrote:
>>>> I stumbled upon this detail in stack overflow:
>>>>
>>>> "The Open Group Base Specifications Issue 7, IEEE Std 1003.1, 2013
>>>> Edition, description of <sys/types.h> says:
>>>>
>>>> The type ssize_t is capable of storing values at least in the range
>>>> [-1, SSIZE_MAX]."
>>>>
>>>> https://stackoverflow.com/questions/8649018/what-is-the-difference-between-ssize-t-and-ptrdiff-t 
>>>>
>>>>
>>>> That means it doesn't necessarily work for the difference between two
>>>> ptrs, which is something I have used it for.
>>>>
>>>> But ptrdiff_t must be capable of storing any such difference.  Also,
>>>> any object's size can be expressed as the difference between its ending
>>>> and starting pointers, so ptrdiff_t must be able to store anything
>>>> ssize_t, STRLEN or plain size_t can store, as well.
>>>
>>> Not exactly, ptrdiff_t is signed while STRLEN and size_t are unsigned.
>>>
>>> The obvious difference is that STRLEN/size_t can store larger positive
>>> numbers than ptrdiff_t, but the more subtle difference is that
>>> overflow for unsigned integer types is well defined, while it causes
>>> undefined behaviour for signed integer types.
>>>
>>> Tony
>>
>> C11 says:
>>
>>> When two pointers are subtracted, both shall point to elements of the
>>> same array object, or one past the last element of the array object;
>>> the result is the difference of the subscripts of the two array
>>> elements. The size of the result is implementation-defined, and its
>>> type (a signed integer type) is ptrdiff_t defined in the <stddef.h>
>>> header. If the result is not representable in an object of that type,
>>> the behavior is undefined.
>>
>> There are many ways to interpret this passage, but according to (most?)
>> C compiler developers, it means that no object can be larger than
>> PTRDIFF_MAX. For example, gcc's optimizer assumes that strlen() will
>> never return anything larger than PTRDIFF_MAX [1].
>>
>> There's also a blogpost[2] on this topic, which IMO is a very
>> interesting read.
>>
>> If gcc and clang can assume that all objects won't be larger than
>> PTRDIFF_MAX, so can we. Also, in practice, ssize_t and ptrdiff_t on most
>> (all?) platforms are defined as exactly the same type.
>>
>> BTW, the fact that compilers assume that objects can't be larger than
>> PTRDIFF_MAX has very dangerous implications on 32-bit platforms. Is it
>> possible to create a string longer than PTRDIFF_MAX on 32-bit perls? It
>> shouldn't be allowed.
> 
> The C standard is poor here.  One reading is that you can have objects
> that are bigger than PTRDIFF_MAX, and if you happen to subtract pointers
> to the wrong two elements, the results are undefined, whereas
> subtracting pointers to the right two gives valid defined results.  And
> there is no convenient way to know whether you have a valid result or
> not.  I don't like this phrase, but it seems apt in this case: "Down
> this path lies madness".  So of course, a practical implementation can't
> have the results sometimes defined and sometimes not, for a valid
> object.  That means that a practical object can't be larger than
> PTRDIFF_MAX.  The implication is that SIZE_MAX really shouldn't be
> larger than PTRDIFF_MAX (though on my Linux gcc box it is twice the
> size), and that is hard to do since size_t is unsigned and ptrdiff_t is
> signed.  The C99 rationale suggests this dilemma can be solved by making
> ptrdiff_t "long long".
> 
> I think Tomasz put it very well when he said that if gcc and clang 
> assume something, then so can we.
> 
> I don't think we should be using ssize_t.  On most platforms, it doesn't 
> matter, but on some it could, and ptrdiff_t is the safer choice.
>>
>> [1] - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78153
>> [2] - https://trust-in-soft.com/objects-larger-than-ptrdiff_max-bytes/
>>
> 

For now, I added code to prevent mallocs larger than PTRDIFF_MAX.
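
To illustrate the idea (this is only a rough sketch, not the code that went
into perl): a wrapper can refuse any request above PTRDIFF_MAX before it ever
reaches malloc(), so that subtracting two pointers into the returned buffer
always gives a representable result. The names checked_malloc and span_length
are made up for this example, and the cast assumes size_t is at least as wide
as ptrdiff_t, which holds on the common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* PTRDIFF_MAX, SIZE_MAX */
#include <stdlib.h>

/* Hypothetical helper, not perl's actual code: reject any request whose
 * size could not be expressed as a pointer difference. Assumes size_t is
 * at least as wide as ptrdiff_t, so the cast does not lose information. */
static void *checked_malloc(size_t nbytes)
{
    if (nbytes > (size_t)PTRDIFF_MAX) {
        return NULL;              /* treated like an allocation failure */
    }
    return malloc(nbytes);
}

/* Distance in bytes between two pointers into the same buffer (or one
 * past its end). For buffers from checked_malloc() the result always
 * fits in ptrdiff_t, so the subtraction is well defined. */
static ptrdiff_t span_length(const char *begin, const char *end)
{
    assert(end >= begin);
    return end - begin;
}
```

A caller would handle a NULL return from checked_malloc() exactly as it
handles a failed malloc(), so the size limit needs no separate error path.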
