perl.beginners | Postings from February 2002

regex to parse HTML files

sachin balsekar
February 26, 2002 06:17
Hi all,

I have one HTML file per news story. I need to fetch some data (the
first few lines) from each HTML file and display it as an abstract for
that story.

The HTML files have the following issues:

1. There could be an HTML table at the very beginning. Can I strip out
the whole table, i.e. everything from <TABLE ...> to </TABLE>? That may
cause problems with nested tables (I am trying a regex for this).

2. I need to pick up the first one or two sentences. I look for a '.'
and pick up the text before it, but this fails for abbreviations and
numbers (Ltd. or 5.8% etc.).

Solving these two issues would get me almost all the way through the
problem.

Please help...
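
For illustration, here is a rough sketch of how the two issues above might be approached in plain Perl. It is not a definitive solution. The names strip_tables, first_sentences and abstract_from_html and the %ABBREV list are made up for this example. It assumes table tags contain no '>' inside attribute values and that a new sentence starts with a capital letter or a quote. A real HTML parser from CPAN (for example HTML::Parser) would be sturdier than regexes for the tag handling.

```perl
use strict;
use warnings;

# Hypothetical abbreviation list; extend it for your own news feed.
my %ABBREV = map { $_ => 1 } qw(ltd inc co corp mr mrs dr vs etc);

# Issue 1: drop every <table>...</table> block, nested ones included,
# by counting nesting depth instead of using one greedy/lazy regex.
sub strip_tables {
    my ($html) = @_;
    my ($out, $depth) = ('', 0);
    # The capturing group makes split return the table tags as well.
    for my $piece (split /(<\/?table\b[^>]*>)/i, $html) {
        if    ($piece =~ /^<table\b/i)   { $depth++ }
        elsif ($piece =~ /^<\/table\b/i) { $depth-- if $depth > 0 }
        elsif ($depth == 0)              { $out .= $piece }
    }
    return $out;
}

# Issue 2: take the first $max sentences. A '.', '!' or '?' only counts
# as a sentence end when followed by whitespace and a capital letter or
# quote, and when the word before a '.' is not a known abbreviation.
sub first_sentences {
    my ($text, $max) = @_;
    my @sentences;
    my $start = 0;
    while ($text =~ /([.!?])\s+(?=[A-Z"'])/g) {
        my $end       = $-[1] + 1;    # offset just past the punctuation
        my $candidate = substr($text, $start, $end - $start);
        my ($last_word) = $candidate =~ /(\w+)\.$/;
        next if defined $last_word && $ABBREV{ lc $last_word };
        push @sentences, $candidate;
        $start = pos($text);          # next sentence begins here
        last if @sentences >= $max;
    }
    # No further boundary found: keep whatever text is left.
    push @sentences, substr($text, $start)
        if @sentences < $max && $start < length $text;
    return join ' ', @sentences;
}

# Combine both steps: strip tables, drop remaining tags, tidy whitespace.
sub abstract_from_html {
    my ($html, $max) = @_;
    $max = 2 unless defined $max;
    my $text = strip_tables($html);
    $text =~ s/<[^>]*>/ /g;           # crude removal of remaining tags
    $text =~ s/\s+/ /g;
    $text =~ s/^\s+|\s+$//g;
    return first_sentences($text, $max);
}
```

Calling abstract_from_html($html, 2) on a page that starts with a nested table followed by "Acme Ltd. reported 5.8% growth in Q4. Shares rose sharply. Analysts were surprised." should return the first two sentences. The known weak spot is a sentence that really does end in a listed abbreviation, which this sketch will merge with the next one.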

