perl.perl6.language.regex | Postings from December 2000

Re: Perl 5's "non-greedy" matching can be TOO greedy!

Deven T. Corzine
December 15, 2000 13:09

On Fri, 15 Dec 2000, Tom Christiansen wrote:

> >We may have to "agree to disagree".  
> I shan't be doing that.

Well, I'm still willing to discuss it, as long as it remains a discussion
and doesn't become a flame war.

> >I understand why people believe in
> >the current semantics, but I've seen no indication that anyone else
> >understands why I believe in these alternative semantics, or has tried.
> >(Disagreeing with my conclusion doesn't preclude understanding where I'm
> >coming from, but nobody seems to.)
> You have not addressed the heat death of the universe as I and
> others have illustrated.  Finding all possible matches is very often
> completely infeasible.  Please solve the electron decay problem
> before continuing.

Where does the heat death of the universe come in?  I can give you a SIMPLE
way to implement it, but I doubt it's the best way: apply the current rules
first, then take the matching substring and search within THAT for a match
with the priority of the rules inverted -- prefer non-greediness OVER
leftmost matching for the second pass.  This WILL get the result I suggest,
preferring leftmost matching in general while still maximizing the amount
of non-greediness (stinginess) within those constraints.
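
To make the two-pass idea concrete, here is a rough sketch. It is written in
Python only because its `re` module gives the same leftmost, backtracking
result as Perl 5 for this pattern. The function name `two_pass_search`, the
pattern, and the sample string are made up for illustration. The second pass
is the naive version (retry at every start position inside the first match),
so it makes no attempt to stay within the "double the time" estimate, and it
relies on `.*?` already yielding the shortest match from a given start, which
holds for simple patterns like this one but is assumed rather than proven in
general.

```python
import re

def two_pass_search(pattern, text):
    """Return (start, end) of the shortest match found inside the normal
    leftmost match, or None if the pattern does not match at all."""
    regex = re.compile(pattern)

    # Pass 1: the current rules (leftmost start wins, then non-greedy).
    first = regex.search(text)
    if first is None:
        return None

    # Pass 2: search only within the first match, but with the priorities
    # inverted: prefer the shortest match, and break ties by leftmost start.
    best = first.span()
    for start in range(first.start(), first.end() + 1):
        m = regex.match(text, start, first.end())  # anchored at `start`
        if m is not None and (m.end() - m.start()) < (best[1] - best[0]):
            best = m.span()
    return best

text = "aaaabbbbccccbbbbdddd"
print(re.search(r"b.*?d", text).group())   # prints "bbbbccccbbbbd"
s, e = two_pass_search(r"b.*?d", text)
print(text[s:e])                           # prints "bd"
```

The strict `<` comparison is what keeps the leftmost candidate when two
candidates have the same length.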

At worst, this should take no more than double the amount of time that the
single pass did, probably less.  Hardly a cause to concern ourselves with
the heat death of the universe.

Note, I do NOT recommend that implementation; it imposes an obvious speed
penalty that shouldn't be imposed on people who don't care.  It might make
sense as an option, however.

However, it does bring another possibility to mind.  For those who are
willing to pay a 100% speed penalty for simplicity, this sort of two-pass
mode could be offered as an option that lets them juggle both preferences.
Maybe it would also be useful to allow a "rightmost matching preference"
option for people who could use it.  (Might be helpful when working with
Hebrew?)

> >Well, obviously we could.  Maybe we shouldn't, but we could do it.  Many,
> >many existing programs depended on Perl 4's magic behavior with @'s in
> >double-quoted strings, yet Perl 5 broke them all with a fatal error during
> >the compile phase.  People survived.  They adapted and moved on.  
> Red herring.

Counterexample to the assumption that we can't break existing code by
changing the semantics.  It's been done before, it could happen again.

> >Unlike that incompatibility, this one would probably affect few
> >programs.
> You're wrong.  Incredibly wrong.  

Really?  Do you have a real-world example that it would break, which would
demonstrate how common such breakage would be?

