
Re: [perl #133365] perl 5.28.0 core: Negative array index read in utf8.c and regexec.c

From:
Karl Williamson
Date:
July 12, 2018 20:05
Subject:
Re: [perl #133365] perl 5.28.0 core: Negative array index read in utf8.c and regexec.c
Message ID:
25f5dc88-2256-663e-4aa2-a0329a25eba9@khwilliamson.com
On 07/12/2018 12:32 PM, James E Keenan via RT wrote:
> On Thu, 12 Jul 2018 13:45:32 GMT, marc-philip.werner@sap.com wrote:
>> To: perlbug@perl.org
>> Subject: Negative array index read in utf8.c and regexec.c
>> Message-Id: <5.28.0_81188_1531401517@WDFM33972517A>
>> From: marc-philip.werner@sap.com
>> Reply-To: marc-philip.werner@sap.com
>>
>>
>> This is a bug report for perl from marc-philip.werner@sap.com,
>> generated with the help of perlbug 1.41 running under perl 5.28.0.
>>
>>
>> -----------------------------------------------------------------
>> Hi,
>> this is about perl 5.28.0. We found this with a coverity scan. Maybe
>> it's paranoid, but I'd still like to let you know. The code looks
>> different in blead, but it looks as if the problem is still there.
>>
>> In utf8.c, line 3672 Perl__invlist_search is called. It can return -1.
>> This return value is used as an array index in the next line.
>> In regexec.c, line 10387, Perl__invlist_search is also called and the
>> return value is used as array index without any check if it's
>> negative.
>>
>> I'm attaching a patchfile. It's at least good to show what I'm aiming
>> at.
>>
>> T&R
>> Marc-Philip
>>
> 
> I have created a branch for smoke-testing this patch:
> 
> smoke-me/jkeenan/mpwerner/133365-negative-array-index
> 
> (I don't have a position on the correctness of either the diagnosis or the solution.)
> 
> Thank you very much.
> 


It isn't checked because it can't happen.  I suppose we could panic.
If it were to return a negative value, it would mean the hardware or
the memory or something was so corrupt that soldiering on wouldn't
make sense.

The reason it can't return negative is that there is an entry
covering every single code point representable in a UV, and you
physically can't be outside that range.
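
To make the argument above concrete, here is a minimal sketch, not the
core's actual code.  The names sketch_invlist_search and
sketch_search_total are hypothetical, and uint64_t stands in for a UV
(an assumption about its width).  The search returns the index of the
last element that is <= the code point, so it can only return -1 when
the code point is below the first element.  If the first element is 0,
no unsigned value can be below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the core's search routine: returns the index
 * of the last element that is <= cp, or -1 if cp is below the first
 * element (or the list is empty). */
ptrdiff_t
sketch_invlist_search(const uint64_t *list, size_t len, uint64_t cp)
{
    size_t lo = 0, hi = len;

    if (len == 0 || cp < list[0])
        return -1;

    /* Invariant: list[lo] <= cp, and every element at index >= hi is > cp. */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid;
        else
            hi = mid;
    }
    return (ptrdiff_t) lo;
}

/* For a list whose first element is 0, the -1 branch above is unreachable:
 * no unsigned cp satisfies cp < 0.  The assert documents that precondition. */
size_t
sketch_search_total(const uint64_t *list, size_t len, uint64_t cp)
{
    assert(len > 0 && list[0] == 0);
    return (size_t) sketch_invlist_search(list, len, cp);
}
```

The assert is roughly what "we could panic" would amount to in this
sketch: it checks the property of the list, not the result of every
search.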

The reason I know it covers every single code point is that it is part
of an inversion map.  An inversion map contains two parallel arrays.
One is the inversion list you are seeing, and the other is a mapping
from every single possible code point to whatever it is being
mapped to.  The arrays are parallel in that the nth entry in one
corresponds to the nth entry in the other.
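
As an illustration of the parallel-array layout just described, here is
a small sketch.  The struct, the function name sketch_invmap_lookup, the
use of uint64_t for a UV, and the int values are all assumptions made
for the example; they are not the core's definitions.  Each entry in
starts begins a range that runs up to the next entry, and the entry at
the same index in values says what that whole range maps to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inversion map: two parallel arrays of equal length.
 * starts[i] is the first code point of range i; range i ends just before
 * starts[i + 1], and the last range runs to the largest possible value.
 * values[i] is what every code point in range i maps to. */
struct sketch_invmap {
    const uint64_t *starts;
    const int      *values;
    size_t          len;
};

int
sketch_invmap_lookup(const struct sketch_invmap *map, uint64_t cp)
{
    size_t i = 0;

    /* The map must cover everything, so its first range starts at 0. */
    assert(map->len > 0 && map->starts[0] == 0);

    /* Linear scan for brevity: advance while the next range starts at or
     * before cp.  A real implementation would use a binary search. */
    while (i + 1 < map->len && map->starts[i + 1] <= cp)
        i++;

    return map->values[i];
}
```

With starts = {0, 0x41, 0x5B, 0x61, 0x7B} and values = {0, 1, 0, 2, 0},
'A' through 'Z' map to 1, 'a' through 'z' map to 2, and everything else,
up to the largest value, maps to 0.  Because the first range starts at
0, the index is always valid for both arrays.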

Someone who is familiar with inversion maps would know immediately that
it can't return negative.  Perhaps a comment could be inserted to that
effect here and wherever else they are used.  Perhaps there is already
such a comment somewhere in the source; I didn't check.
