perl.perl5.porters | Postings from October 2015

Re: [perl #116639] regex optimiser wrongly rejects certain matches involving embedded comments

From: demerphq
Date: October 14, 2015 08:05
Subject: Re: [perl #116639] regex optimiser wrongly rejects certain matches involving embedded comments
Message ID: CANgJU+W3aO+B4eL7wrVbtUw8__TpO6rWR68n0G3-GMagP7uXCQ@mail.gmail.com

On 14 October 2015 at 00:42, Karl Williamson via RT
<perlbug-followup@perl.org> wrote:
> I may have closed this prematurely.  I had not read the extensive commentary on this when I closed it, only the original report.  So I had forgotten the controversy over what should happen.
>
> To recap what has happened in blead:  It turns out that no one (including me) thought about nextchr()'s behavior when the pattern is UTF-8 encoded.  It did a simple ++ of the parse position, which is the wrong thing to do when the character is a multi-byte character.  It would then point to the 2nd byte of that character, so the tests it did after the increment for white space under /x would fail for white space that was multi-byte.  When I tried to write tests after fixing that, I discovered that nothing I came up with would reliably fail.  And valgrind showed that there were reads of garbage data outside the buffer.  That led to me fixing a bunch of nextchr calls, and that led to making all such stuff uniform.  And that led to this bug being fixed.
>
> But do we really want a (?#...) comment between a character and its quantifier?

IMO no.
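
For anyone skimming the recap quoted above, the pointer-advance problem can be sketched in C roughly as follows. This is only an illustration, not the actual regcomp.c code: utf8_seq_len, advance_bytewise and advance_utf8 are hypothetical names made up for this sketch, and the real fix in blead may well look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the perl source): number of bytes in the
 * UTF-8 sequence introduced by lead byte b.  Continuation bytes and
 * invalid lead bytes are treated as length 1 so a caller always advances. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;  /* 0xxxxxxx: ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* The naive "++" advance: only correct for single-byte characters.  On a
 * multi-byte character it leaves the pointer on a continuation byte. */
static const char *advance_bytewise(const char *p, const char *end)
{
    assert(p < end);
    return p + 1;
}

/* UTF-8-aware advance: skip the whole character, clamped so that a
 * truncated sequence can never move the pointer past the buffer end. */
static const char *advance_utf8(const char *p, const char *end)
{
    size_t len;

    assert(p < end);
    len = utf8_seq_len((unsigned char)*p);
    if (len > (size_t)(end - p))
        len = (size_t)(end - p);
    return p + len;
}
```

With a buffer holding 'a', the three bytes E2 80 A8 (U+2028 LINE SEPARATOR) and then 'b', advance_bytewise() called at the second byte stops on 0x80, in the middle of the character, while advance_utf8() lands on 'b'. Any check made after the bytewise step is looking at a continuation byte rather than the start of a character.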

Yves



-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
