develooper Front page | perl.perl6.language.regex | Postings from December 2000

Re: Perl 5's "non-greedy" matching can be TOO greedy!

From: Kevin Walker
Date: December 15, 2000 12:42
Subject: Re: Perl 5's "non-greedy" matching can be TOO greedy!
Message ID: v04210101b6602e5517ed@[207.170.238.224]

I wrote:

>More generally, it seems to me that you're hung up on the 
>description of "*?" as "shortest possible match".  That's an 
>ambiguous simplification of what "*?" means.  It might better be 
>described as "match until you find a match for the rest of the 
>regex" ('d' in your example).  If oversimplifications in the 
>documentation led you to believe that "*?" meant something it was 
>never intended to mean, then perhaps the documentation should be 
>clarified.

I should have added that when I first came across non-greedy regexes, 
I made exactly the same erroneous assumption (that assumption being, 
"*?" finds the shortest possible match, or at least the shortest 
possible local match).  Once I learned the actual meaning, I realized 
that it was more sensible than my initial naive interpretation.
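
To make the difference concrete, here is a minimal sketch. It uses Python's `re` module as a stand-in, on the assumption that its backtracking engine treats `*?` the same way Perl 5 does for a pattern this simple. The sample string is made up for illustration and is not the one from earlier in the thread.

```python
import re

# Hypothetical sample text; not the string from the original thread.
text = "aabbbbccd"

# Lazy match: start at the leftmost 'b', then extend one character at a
# time until the rest of the regex ('d') matches.
lazy = re.search(r"b.*?d", text).group()

# For comparison, brute-force the shortest substring that fits the pattern.
candidates = [
    text[i:j]
    for i in range(len(text))
    for j in range(i + 1, len(text) + 1)
]
shortest = min((s for s in candidates if re.fullmatch(r"b.*d", s)), key=len)

print(lazy)      # bbbbccd
print(shortest)  # bccd
```

The lazy match is not the shortest one available, because the engine commits to the leftmost starting point before "*?" ever gets a say.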

Deven seems to be advocating thinking about regular expressions 
without worrying too much about the implementation, even at a fairly 
abstract level.  (By abstract level, I mean something like "keep 
matching non-newlines, until you come to a 'd'".)  I think this is a 
serious mistake.  In my younger, less experienced days, I showed 
great talent for writing regexes which, if not 
heat-death-of-the-universe-slow, were at least inefficient enough to 
exhaust Perl's available memory.  This happened when I attempted 
fairly innocuous things, like extracting the headers of an email message. 
Moral:  Ignore the underlying implementation at your peril.
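
One well-known way a regex becomes that slow, though not necessarily the pattern described above, is a nested quantifier such as `(a+)+b` applied to a long run of 'a's with no 'b' after it. A naive backtracking engine may try every way of cutting the run into groups before giving up. The sketch below only counts those ways; it does not run a regex engine, and the function name `ways_to_split` is made up for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ways_to_split(n: int) -> int:
    """Count the ways to cut a run of n 'a's into non-empty groups.

    Each way is one path a naive backtracking engine could explore for
    (a+)+ before failing on the missing 'b'.
    """
    if n == 0:
        return 1  # nothing left to cut: exactly one way to finish
    # Choose the length k of the first group, then split the remainder.
    return sum(ways_to_split(n - k) for k in range(1, n + 1))


for n in (5, 10, 20):
    print(n, ways_to_split(n))
# 5 16
# 10 512
# 20 524288
```

The count doubles with every extra character, which is why a pattern that looks harmless on a short test string can fall over on real input.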



