perl.perl6.language.regex

Re: Perl 5's "non-greedy" matching can be TOO greedy!

Deven T. Corzine
December 14, 2000 15:02

On Thu, 14 Dec 2000, Jeff Pinyan wrote:

> On Dec 14, Deven T. Corzine said:
> >> You're asking for something like
> >> 
> >>   /(?<!b)(b.*?d)/
> >> 
> >> which is an "optimization" you'll have to incorporate on your own.
> >
> >Thanks for the example.  Unfortunately, your attempted workaround doesn't
> >even work for the example string; the "a" preceding "bbbbccccd" isn't a
> >"b", so the regexp engine is still perfectly happy with the same match.
> >Even if it worked as you intended, it would have failed with something like
> >"bbbabbbccccdddd", since the ".*?" would happily match "bbabbbcccc"...
> Sorry, I was thinking backwards (when I was supposed to think forwards).
> Using:
>   /(b(?!b).*?d)/
> is what I'd meant to say.

True, this one works, and I should have thought of it.  Unlike the wrong
example of /(b[^b].*?d)/, using the zero-width lookahead assertion fixes
the problem for this regexp, even for matches of "bd".  (Again, not all
regexps will be as simple.)
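
For anyone who wants to check the three variants side by side, here is a
minimal sketch. The helper first_capture is hypothetical, and "abbbbccccd"
is only an assumed stand-in for the example string (an "a" followed by
"bbbbccccd"); the other two strings are the ones mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what the first capture group matched,
# or the string "(no match)" when the pattern fails.
sub first_capture {
    my ($string, $regex) = @_;
    return $string =~ $regex ? $1 : '(no match)';
}

# Each case pairs a test string with a pattern source from the thread.
# "abbbbccccd" is an assumed stand-in for the original example string.
my @cases = (
    [ 'abbbbccccd',      '(b.*?d)'       ],  # captures "bbbbccccd"
    [ 'abbbbccccd',      '(?<!b)(b.*?d)' ],  # still "bbbbccccd": "a" is not "b"
    [ 'bbbabbbccccdddd', '(?<!b)(b.*?d)' ],  # captures "bbbabbbccccd"
    [ 'abbbbccccd',      '(b[^b].*?d)'   ],  # captures "bccccd"
    [ 'bd',              '(b[^b].*?d)'   ],  # no match: [^b] consumes the "d"
    [ 'abbbbccccd',      '(b(?!b).*?d)'  ],  # captures "bccccd"
    [ 'bd',              '(b(?!b).*?d)'  ],  # captures "bd": lookahead is zero-width
);

for my $case (@cases) {
    my ($string, $source) = @$case;
    my $regex = qr/$source/;
    printf "%-16s %-16s %s\n", $string, $source, first_capture($string, $regex);
}
```

The lookahead only tests the next character without consuming it, which is
why "bd" still matches where the [^b] version cannot.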

> I wouldn't call the current behavior a design flaw, though.

I do.  From an implementation standpoint, it may have been more convenient
than getting the semantics EXACTLY right, and the differences are subtle
enough that it's debatable how much effort is worth expending on fixing the
problem.  That's why I said it was "understandable", but it remains a flaw.

I haven't even SEEN an example where the current behavior is actually
preferable to my proposed behavior; have you?  (And I'd expect at least a
FEW, though I suspect counterexamples are more common.)

This is certainly a minor issue, but I believe the design is flawed in the
current Perl 5 implementation; why not fix the design properly for Perl 6?

Deven