develooper Front page | perl.perl5.porters | Postings from July 2013

Re: Perl 5.18 and Regexp::Grammars

Thread Previous | Thread Next
Karl Williamson
July 1, 2013 17:05
Re: Perl 5.18 and Regexp::Grammars
Message ID:
On 07/01/2013 04:12 AM, Dave Mitchell wrote:
> On Thu, Jun 27, 2013 at 10:42:54PM -0600, Karl Williamson wrote:
>> On 06/27/2013 06:22 PM, Dave Mitchell wrote:
>>> I can reduce the failure to the following code:
>>>      m{(?&foo){0}(?<foo>)}
>>> which SEGVs during compilation. 'git blame' blames Karl.
>>> (which is not to say that there aren't other SEGing issues, which might
>>> involve code blocks, but this first SEGV is innocent of them).
>>> Karl, is this something you can deal with?
>>> v5.17.7.0-60-g3018b82
>>> 3018b823898645e44b8c37c70ac5c6302b031381 is the first bad commit
>>> commit 3018b823898645e44b8c37c70ac5c6302b031381
>> I believe I have found the problem.  I mistakenly combined what
>> should have been two commits in this one, and the commit message
>> doesn't even mention the one at fault.
>> What appears to be going on is that
>> (?&foo){0}
>> is getting replaced by a NOTHING node, which is an optimization we
>> agreed to on p5p.  Thus it doesn't set up 'foo', and so the
>> reference to it is NULL, which segfaults when dereferenced.
>> Suggestions as to how to proceed are welcome.  Obviously, we could
>> remove the optimization.  Or apply it when what is being replaced is
>> one of certain node types, or has some characteristic, like width.
>> I haven't looked into this at all.
> Not knowing the details of the optimisation, I would hazard the suggestion
> what we just remove the optimisation for now (hopefully a simple fix that
> can be done now and backported in time for 5.18.1), then worry about doing
> a better fix later on.

Now reverted by

commit 2e3a23da260a7ec5d61b81cb34c38de5e528b41d
  Author: Karl Williamson <>
  Date:   Mon Jul 1 10:26:14 2013 -0600

      Fix regex seqfault 5.18 regression

      This segfault is a result of an optimization that can leave the
      compilation in an inconsistent state.


      doesn't match anything, and hence should be removable from the 
regex for
      all f.  However,


      caused a segfault.  What was happening prior to this commit is that
      (?&foo) refers to a named capture group further along in the regex.
      The "{0}" caused the "(?&foo)" to be discarded prior to setting up the
      pointers between the two related subexpressions; a segfault follows.

      This commit removes the optimization, and should be suitable for a
      maintenance release.

      One might think that no one would be writing code like this, but this
      example was distilled from machine-generated code in Regexp::Grammars.

      Perhaps this optimization can be done, but the location I chose for
      checking it was during parsing, which turns out to be premature.  It
      would be better to do it in the optimization phase of regex 
      Another option would be to retain it where it was, but for it to 
      only on a limited set of nodes, such as EXACTish, which would have no
      unintended consequences.  But that is for looking at in the 
future; the
      important thing is to have a simple patch suitable for fixing this
      regression in a maintenance release.

      For the record, the code being reverted was mistakenly added by me in
      commit 3018b823898645e44b8c37c70ac5c6302b031381, and wasn't even
      mentioned in that commit message.  It should have had its own commit.

Thread Previous | Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About