develooper Front page | perl.beginners | Postings from January 2002

Strange (from my perspective) regex behavior

From: Peter Cline
Date: January 4, 2002 12:57
Subject: Strange (from my perspective) regex behavior
Message ID: 4.3.2.7.2.20020104154005.00aa3be0@mailgate.nytimes.com

I am trying to extract some text from a file using a regular 
expression.  It is not behaving as expected, and I am totally perplexed as to why.
Here is an excerpt of the text:

1. Top Story: Dynegy in Agreement to Get Enron Pipeline
2. M&A: Newmont-Normandy, Hewlett-Compaq, Pax TV, WorldCom
3. Investment Banking: Goldman, Sandler, Merrill Lynch
4. I.P.O.s/Offerings: Sirius Satellite Radio, Neuer Markt
5. Venture Capital: Lucent-Coller Capital, EM.TV
6. Private Equity: HSBC, Canada 3000, Edel Music, Kumho Tire
7. Legal: GE Capital Aviation, EchoStar-DirecTV
8. Correction: Daily Deal Echostar-DirecTV  Article


/------------------advertisement--------------\

I want to extract the numbered list.

Here is the regex I am using to do it:
m!((\d\.\s\D+)+)/[-]+advertisement!

For some reason the match starts at item 7.  If I eliminate 
everything after the / in the regex, it matches from item 1 to the / in item 4.
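
The behavior described above can be reproduced with a short script. What follows is only a sketch: the variable names ($text, $original, $per_line) are hypothetical, and the second pattern is one possible alternative that assumes each item sits on its own line, not a definitive fix. The likely cause is that \D+ also matches newlines but cannot cross the digits in "Canada 3000" in item 6, so no attempt starting at items 1 through 6 can ever reach the advertisement line.

```perl
use strict;
use warnings;

# The excerpt from the post, held in a hypothetical variable $text.
my $text = <<'END';
1. Top Story: Dynegy in Agreement to Get Enron Pipeline
2. M&A: Newmont-Normandy, Hewlett-Compaq, Pax TV, WorldCom
3. Investment Banking: Goldman, Sandler, Merrill Lynch
4. I.P.O.s/Offerings: Sirius Satellite Radio, Neuer Markt
5. Venture Capital: Lucent-Coller Capital, EM.TV
6. Private Equity: HSBC, Canada 3000, Edel Music, Kumho Tire
7. Legal: GE Capital Aviation, EchoStar-DirecTV
8. Correction: Daily Deal Echostar-DirecTV  Article


/------------------advertisement--------------\
END

# Original pattern from the post. \D+ stops at the first digit, so the
# "3000" in item 6 blocks every attempt that starts before item 7.
my ($original) = $text =~ m!((\d\.\s\D+)+)/[-]+advertisement!;

# Assumed alternative: consume one whole line per item. In Perl, "." does
# not match a newline by default, so digits inside an item are harmless.
my ($per_line) = $text =~ m!((?:\d+\.\s.+\n)+)\s*/-+advertisement!;

print "original starts with: ", substr($original, 0, 9), "\n";
print "per-line starts with: ", substr($per_line, 0, 12), "\n";
```

With the excerpt above, the first capture begins at "7. Legal:" and the second at "1. Top Story". The per-line pattern would need adjusting if an item could wrap onto a second line.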

I am totally perplexed as to why this is happening.  If anyone has insight, 
I would be most appreciative.

Thanks
Peter Cline
Inet Developer
New York Times Digital 