
Re: [perl #116639] regex optimiser wrongly rejects certain matches involving embedded comments

From: Eric Brine
Date: October 15, 2015 00:29
Subject: Re: [perl #116639] regex optimiser wrongly rejects certain matches involving embedded comments
Message ID: CALJW-qFtHHOd3QTkSpuL8cYOzxHvcKQ9U3vwdoRfmu1xAoxk+A@mail.gmail.com

On Wed, Oct 14, 2015 at 4:05 AM, demerphq <demerphq@gmail.com> wrote:

> On 14 October 2015 at 00:42, Karl Williamson via RT
> <perlbug-followup@perl.org> wrote:
> > I may have closed this prematurely.  I had not read the extensive
> > commentary on this when I closed it, only the original report.  So I had
> > forgotten the controversy over what should happen.
> >
> > To recap what has happened in blead:  It turns out that no one
> > (including me) thought about nextchr()'s behavior when the pattern is UTF-8
> > encoded.  It did a simple ++ of the parse position, which is the wrong
> > thing to do when the character is a multi-byte character.  It would point
> > to the 2nd byte of that, hence the tests it did after the increment for
> > white space under /x would fail for white space that was multi-byte.  When
> > I tried to write tests after fixing that, I discovered that nothing I came
> > up with would reliably fail.  And valgrind showed that there were reads outside
> > the buffer of garbage data.  That led to me fixing a bunch of nextchr
> > calls, and that led to making all such stuff uniform.  And that led to this
> > bug being fixed.
> >
> > But do we really want a (?#...) comment between a character and its
> > quantifier?
>
> IMO no.
>
> Yves
>

So the follow-up question would be: What should happen when someone does that?
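
For anyone following along, the parse-position problem Karl describes in the quoted recap can be sketched roughly as below. This is only an illustration and not the actual regcomp.c code: utf8_seq_len(), advance_naive() and advance_utf8() are hypothetical names made up for this sketch, and the real fix in blead may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not Perl's API): number of bytes in the UTF-8
 * sequence introduced by lead byte b. Continuation bytes and invalid
 * lead bytes are treated as length 1 so the caller always makes progress. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;            /* ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Naive advance: a plain increment of the parse position. On a
 * multi-byte character this lands on the character's 2nd byte. */
static const unsigned char *advance_naive(const unsigned char *p,
                                          const unsigned char *end)
{
    assert(p < end);
    return p + 1;
}

/* UTF-8-aware advance: skip the whole character, clamped to `end`
 * so a truncated sequence cannot move the position past the buffer. */
static const unsigned char *advance_utf8(const unsigned char *p,
                                         const unsigned char *end)
{
    size_t len;
    assert(p < end);
    len = utf8_seq_len(*p);
    if (len > (size_t)(end - p))
        len = (size_t)(end - p);
    return p + len;
}
```

With a pattern whose bytes are E2 80 A8 (U+2028) followed by '*', the naive version ends up pointing at the 0x80 continuation byte, so any check made at that position looks at the middle of a character, while the UTF-8-aware version ends up at the '*'.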
