develooper Front page | perl.beginners | Postings from February 2002

Reading HTML Files for Data.

From: sachin balsekar
Date: February 25, 2002 19:46
Subject: Reading HTML Files for Data.
Message ID: 3C7B0839.2070303@myiris.com

Hi ppl,

I have one HTML file per news story. I need to fetch some data (the
first few lines of text) from each HTML file and display it as an
abstract for that story.

The HTML files raise the following issues:

1. There could be an HTML table at the very beginning. Can I strip out
the whole table, i.e. everything from <TABLE ...> to </TABLE>? A simple
match may cause problems with nested tables (I am trying a regex for this).

2. I need to pick up the first one or two lines (sentences). I look for
a '.' and take the text up to it, but this fails for abbreviations and
numbers (Ltd. or 5.8%, etc.).
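
For issue 1, one possible way around the nesting problem is to count table depth instead of relying on a single regex from <TABLE to </TABLE>. The sketch below is only an illustration: strip_tables is a hypothetical helper name, it removes every table rather than only a leading one, and it assumes table tags do not contain a '>' inside attribute values and do not appear inside comments or scripts. A real HTML parser would be more robust.

```perl
use strict;
use warnings;

# Remove every <table>...</table> block, nested ones included, by
# tracking nesting depth rather than matching open/close in one regex.
sub strip_tables {
    my ($html) = @_;
    my $out   = '';
    my $depth = 0;

    # The capture group makes split return the table tags as pieces too.
    for my $piece (split m{(</?table\b[^>]*>)}i, $html) {
        if ($piece =~ m{^<table\b}i) {
            $depth++;                    # entering a table
        }
        elsif ($piece =~ m{^</table\b}i) {
            $depth-- if $depth > 0;      # leaving a table
        }
        elsif ($depth == 0) {
            $out .= $piece;              # keep text outside all tables
        }
    }
    return $out;
}

my $page = '<TABLE><tr><td><table><tr><td>nav</td></tr></table></td></tr></TABLE>'
         . '<p>Story text.</p>';
print strip_tables($page), "\n";
```

Because only pieces seen at depth 0 are kept, an inner </table> cannot end the removal early, which is where a plain non-greedy match goes wrong.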

Solving these two issues would get me most of the way through the problem.
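
For issue 2, a rough sketch of one heuristic: treat '.', '!' or '?' as the end of a sentence only when it is followed by whitespace or the end of the text, and skip it when the word before it is a known abbreviation. The name abstract_of and the abbreviation list are assumptions made for illustration, and the list is far from complete. Tag removal here is deliberately crude.

```perl
use strict;
use warnings;

# Hypothetical, incomplete list of abbreviations that end in a period.
my %ABBREV = map { lc($_) => 1 } qw(Ltd Inc Co Corp Mr Mrs Dr No vs);

# Return the first $want sentences (default 2) of the text in $html.
sub abstract_of {
    my ($html, $want) = @_;
    $want ||= 2;

    (my $text = $html) =~ s/<[^>]*>/ /g;   # crude tag removal
    $text =~ s/\s+/ /g;                    # collapse whitespace
    $text =~ s/^\s+|\s+$//g;               # trim both ends

    my @sentences;
    my $start = 0;
    # "5.8%" is skipped because its '.' is followed by a digit.
    while ($text =~ /[.!?](?=\s|\z)/g) {
        my $end       = pos($text);
        my $candidate = substr($text, $start, $end - $start);

        # Skip "Ltd." and friends: keep scanning from the same start.
        my ($last_word) = $candidate =~ /(\w+)\.\z/;
        next if defined $last_word && $ABBREV{ lc($last_word) };

        $candidate =~ s/^\s+//;
        push @sentences, $candidate;
        last if @sentences >= $want;
        $start = $end;
    }
    return join ' ', @sentences;
}

my $story = '<p>Acme Ltd. posted growth of 5.8% this quarter. '
          . 'Shares rose in early trade. Analysts were surprised.</p>';
print abstract_of($story, 2), "\n";
```

An abbreviation that really does end a sentence will be merged into the next one, so this trades a few over-long abstracts for fewer truncated ones.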

Please help...

Regs,
sgb





