develooper Front page | perl.perl5.porters | Postings from May 2013

Re: [perl #117327] Sequence (?#...) not recognized in regex

Thread Previous | Thread Next
From:
Karl Williamson
Date:
May 1, 2013 01:30
Subject:
Re: [perl #117327] Sequence (?#...) not recognized in regex
Message ID:
51807025.5010604@khwilliamson.com
On 04/30/2013 05:09 PM, Ricardo SIGNES via RT wrote:
> On Mon Apr 29 11:13:31 2013, public@khwilliamson.com wrote:
>> I am confident …
>
> Your confidence fills me with contentment!
>
>> What I am unsure of is if we should, so close to 5.18, be making
>> something a syntax error that previously wasn't.
>
> I have thought about this.
>
> Right now, we made an inconsistent behavior stay inconsistent, but different and, I think, worse.
>
> I considered these primary options:
>
> (a) the "just for-Heaven's-sake do it" option
>
> Here, we apply Karl's patch.  This eliminates inconsistency now, but adds more fatal cases.  I
> doubt anyone is strongly opposed to this change… except that it would be getting applied
> during the big scary 5.18.0 code freeze, and we have to consider that if "real code" is relying on
> the behavior of /( ?: abc)/ it has a greater chance of being untested and fixed "in time" if the new
> error is first found (other than blead) in 5.18.0-RC1.
>
> On the other hand, it closes the case on this bug, and it is quite tempting to believe that if any
> code is relying on this very sketchy area of the regex system, it's already a grotesque mess, and
> such code probably doesn't even *exist* anyway.  Unfortunately, such code clearly does exist:
>
> http://grep.cpan.me/?q=%5C%28%5Cs%2B%5C%3F%5B%21%3A%3E%5D
>
> As someone who loathes "just put more whitespace anywhere you like" grammars, I was sad to
> see my name in that list. ;-)  It looks to me like "compressed regex nonsense" is likely to be a
> factor here, which means "stuff breaking that nobody knows why or how or what to fix."
>
> (b) the "don't break backcompat" option
>
> Here, we revert the original change, with a firm plan to fix it thoroughly in the future.  We need
> to issue a notice that the current behavior is going to be corrected, so we do that now.  Then in
> 5.19.0, we strictly enforce the error, going to option (a).  We retain the "old weird," but add a
> warning, which has some chance of breaking code.  (Most likely, the cynic in me supposes, by
> breaking "no warnings" tests, while the code that would've been effected is itself untested.)
>
> (c) the "all things in moderation" option
>
> We leave things exactly as they are in blead, post-50485807, but alter the emitted error to say
> "you can't put space where you put it."  This puts any likely breakage's starting point back in
> January (before the user-visible change freeze), and we move to option (a) in 5.19.0.  There are
> two down sides, here:  first, we're adding a new syntax error where there wasn't one without
> warning; secondly, we're introducing inconsistency.  The inconsistency doesn't bother me *as
> long as we fix it ASAP*.  This is like "we fixed a bunch of parts of a bug, and will continue to fix
> its other parts."  This kind of half-fix is something I am expressly okay with, as we're doing
> timeboxed releases.
>
> That said, what's the win of (c) over (b)?  I'm not sure.  I don't think we really lose any momentum
> on the problem, nor do we return to some terrible state.
>
> SO:
>
> My original "all things equal, just fix the message" had in mind something like (c): Just change
> the text we emit so that we can release 5.18.0 and then fix things better.  Now it seems clear to
> me that we need a deprecation period for the bogus-space problem.  (Specifically, I'm thinking
> of code like that in the cpan grep, above.)
>
> I think option (a) is right out for 5.18.0.

I think you meant s/right/ruled/
>
> Karl also wrote:
>> I suppose we could tweak the patch so that it gives an error in
>> precisely the same places as the misleading error would, and lets all
>> the other problematic constructs go by, unwarned on, and then change it
>> in 5.19 to warn. I haven't investigated how hard this would be.
>
> Karl (or anybody) what is the difficulty in basically keeping the old behavior but issuing a
> warning on the problematic constructs *right away*, to be made fatal in 5.19.0-ish?  That is:
> can we enact option (b) in a timely way?
>

My original intent was to change the error message, but what I ended up 
with was the simplest thing to implement.  I discovered that you could 
legally have:

qr/((?#
      Thousand-
      line
      comment
      ...
      ...
  )?:foo)/

That seemed to me to indicate a real issue in the need of fixing.

I don't fully understand your final suggestion.  So what I think we 
should do may be what you are saying.  I think we should not apply my 
patch, but revert 504858073fe16afb61d66a8b6748851780e51432, Nicholas's 
patch that this bug bisects to, where he mistakenly thought the code was 
dead, but it wasn't.

If it is deemed ok to add a warning this late for 5.18, then my patch 
can trivially be modified to warn and not fail.  We can make that 
warning either default on or default off.


Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About