Front page | perl.perl5.porters |
Postings from July 2013
Re: Perl 5.18 and Regexp::Grammars
Thread Previous
|
Thread Next
From:
Karl Williamson
Date:
July 1, 2013 17:05
Subject:
Re: Perl 5.18 and Regexp::Grammars
Message ID:
51D1B6D5.7060404@khwilliamson.com
On 07/01/2013 04:12 AM, Dave Mitchell wrote:
> On Thu, Jun 27, 2013 at 10:42:54PM -0600, Karl Williamson wrote:
>> On 06/27/2013 06:22 PM, Dave Mitchell wrote:
>>> I can reduce the failure to the following code:
>>>
>>> m{(?&foo){0}(?<foo>)}
>>>
>>> which SEGVs during compilation. 'git blame' blames Karl.
>>> (which is not to say that there aren't other SEGing issues, which might
>>> involve code blocks, but this first SEGV is innocent of them).
>>>
>>> Karl, is this something you can deal with?
>>>
>>> v5.17.7.0-60-g3018b82
>>>
>>> 3018b823898645e44b8c37c70ac5c6302b031381 is the first bad commit
>>> commit 3018b823898645e44b8c37c70ac5c6302b031381
>>
>> I believe I have found the problem. I mistakenly combined what
>> should have been two commits in this one, and the commit message
>> doesn't even mention the one at fault.
>>
>> What appears to be going on is that
>>
>> (?&foo){0}
>>
>> is getting replaced by a NOTHING node, which is an optimization we
>> agreed to on p5p. Thus it doesn't set up 'foo', and so the
>> reference to it is NULL, which segfaults when dereferenced.
>>
>> Suggestions as to how to proceed are welcome. Obviously, we could
>> remove the optimization. Or apply it when what is being replaced is
>> one of certain node types, or has some characteristic, like width.
>> I haven't looked into this at all.
>
> Not knowing the details of the optimisation, I would hazard the suggestion
> what we just remove the optimisation for now (hopefully a simple fix that
> can be done now and backported in time for 5.18.1), then worry about doing
> a better fix later on.
>
Now reverted by
commit 2e3a23da260a7ec5d61b81cb34c38de5e528b41d
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Jul 1 10:26:14 2013 -0600
Fix regex seqfault 5.18 regression
This segfault is a result of an optimization that can leave the
compilation in an inconsistent state.
/f{0}/
doesn't match anything, and hence should be removable from the
regex for
all f. However,
qr{(?&foo){0}(?<foo>)}
caused a segfault. What was happening prior to this commit is that
(?&foo) refers to a named capture group further along in the regex.
The "{0}" caused the "(?&foo)" to be discarded prior to setting up the
pointers between the two related subexpressions; a segfault follows.
This commit removes the optimization, and should be suitable for a
maintenance release.
One might think that no one would be writing code like this, but this
example was distilled from machine-generated code in Regexp::Grammars.
Perhaps this optimization can be done, but the location I chose for
checking it was during parsing, which turns out to be premature. It
would be better to do it in the optimization phase of regex
compilation.
Another option would be to retain it where it was, but for it to
operate
only on a limited set of nodes, such as EXACTish, which would have no
unintended consequences. But that is for looking at in the
future; the
important thing is to have a simple patch suitable for fixing this
regression in a maintenance release.
For the record, the code being reverted was mistakenly added by me in
commit 3018b823898645e44b8c37c70ac5c6302b031381, and wasn't even
mentioned in that commit message. It should have had its own commit.
Thread Previous
|
Thread Next