
Re: [perl #130648] regcomp.c:6195: void S_pat_upgrade_to_utf8(RExC_state_t *const, char **, STRLEN *, int): Assertion `*(d - 1) == ')'' failed

From:
Dave Mitchell
Date:
January 30, 2017 16:42
Subject:
Re: [perl #130648] regcomp.c:6195: void S_pat_upgrade_to_utf8(RExC_state_t *const, char **, STRLEN *, int): Assertion `*(d - 1) == ')'' failed
Message ID:
20170130164215.GF8158@iabyn.com
On Sun, Jan 29, 2017 at 08:17:33AM -0800, Hugo van der Sanden via RT wrote:
> On Thu, 26 Jan 2017 02:19:19 -0800, randir wrote:
> > While fuzzing perl v5.25.9-35-g32207c637b built with afl and run
> > under libdislocator, I found the following 16-bytes program
> > 
> > hexdump -C 0042
> > 00000000  6d 27 5c 34 30 30 28 3f  7b 3c 3c 7d 29 0a 0a 27
> > |m'\400(?{<<})..'|
> > 00000010
> > 
> > to cause an assertion failure.
> 
> We're hitting S_pat_upgrade_to_utf8() with a code block of
> "(?{<<})\n\n". My initial suspicion is that that's fine, and the
> assumption that the last char of such a code block must be ')' is wrong,
> but I don't know.

Hmmm... the assertion is correct, the toker is very wrong.

When compile-time code is seen in a pattern, the code is parsed, so that
for

    /abc(?{...})def/

the toker returns this sequence of tokens:

    FUNC, '(', const("abc"), 'DO', '{', ...., '}', '(?{...})', 'def', ')'

As well as the individual parsed tokens for the code block, the text of
the code block is returned afterwards as a separate const op, which is
used by re_op_compile() to reconstruct the original text of the regex
(in case a regex is ever stringified).

The problem with

    m{\x{100}(?{<<EOF})
    x
    EOF
    }

is that the stringification of the code block is being returned by yylex()
as

    "(?{<<EOF})\nx\nEOF"

rather than what I'd expect:

    "(?{\"x\n\"})"

(or similar).

But to a certain extent it depends on how heredocs are supposed to operate
within regex codeblocks, and how such regexes are supposed to stringify.
I think FC did a lot of fixups in this area recently.

This is all too horrible to contemplate at the moment.


-- 
My Dad used to say 'always fight fire with fire', which is probably why
he got thrown out of the fire brigade.
