
[perl #116639] regex optimiser wrongly rejects certain matches involving embedded comments

Karl Williamson via RT
October 13, 2015 22:42
I may have closed this prematurely.  I had not read the extensive commentary on this when I closed it, only the original report.  So I had forgotten the controversy over what should happen.

To recap what has happened in blead:  It turns out that no one (including me) had thought about nextchr()'s behavior when the pattern is UTF-8 encoded.  It did a simple ++ of the parse position, which is the wrong thing to do when the current character is multi-byte:  the position would then point to the 2nd byte of that character, so the tests for white space under /x that it did after the increment would fail for multi-byte white space.  When I tried to write tests after fixing that, I discovered that nothing I came up with would reliably fail, and valgrind showed that there were reads of garbage data outside the buffer.  That led to me fixing a bunch of nextchr calls, which led to making all such code uniform, which in turn led to this bug being fixed.
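
To make the ++ problem concrete, here is a minimal sketch in C.  It is not the code in blead; the function names (utf8_seq_len, advance_bytewise, advance_char) are made up for illustration, and perl's own source uses its own helpers for this.  It only shows why stepping one byte lands in the middle of a multi-byte character, and what stepping by the character's encoded length looks like instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes in the UTF-8 sequence that starts
 * with lead byte b.  Returns 1 for ASCII, and also 1 for a byte that is
 * not a valid lead byte, so the caller always makes progress. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;   /* 0xxxxxxx */
    if ((b & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 1;                           /* continuation or invalid byte */
}

/* The old behavior described above: a plain ++ of the parse position. */
static const char *advance_bytewise(const char *p, const char *end)
{
    assert(p < end);
    return p + 1;
}

/* Advance by one whole character.  When the pattern is UTF-8 this skips
 * every byte of the current character; it never steps past 'end', even
 * if the final sequence is truncated. */
static const char *advance_char(const char *p, const char *end, int is_utf8)
{
    size_t len;

    assert(p < end);
    len = is_utf8 ? utf8_seq_len((unsigned char)*p) : 1;
    if (len > (size_t)(end - p))
        len = (size_t)(end - p);
    return p + len;
}
```

For example, U+0085 (NEXT LINE) is encoded in UTF-8 as the two bytes 0xC2 0x85.  Starting at the 0xC2, advance_bytewise() stops on the 0x85 continuation byte, so a following check for white space looks at half a character, while advance_char() with is_utf8 set stops on the byte after the whole character.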

But do we really want a (?#...) comment between a character and its quantifier?  I can see both sides of the issue, so am now bringing it up for discussion again.  blead is now in a state where it would be easy to choose which places allow (?#...) and which forbid it, while still allowing white space and regular # comments in those places, both only under /x.  We could allow (?#...) only under /x in such cases if we choose.  It's easy to change it to do any of this, and I'm willing to do the work once a decision has been made as to what to do.

My only stance on this is that I think (but could be convinced otherwise) that under /x, anywhere a # comment is allowed should also allow a (?#...) comment.
Karl Williamson

via perlbug:  queue: perl5 status: pending release
