develooper Front page | perl.beginners | Postings from February 2002

Stripping HTML

From: Daniel Falkenberg
Date: February 26, 2002 21:17
Subject: Stripping HTML
Message ID: 3ACA70B144BD6D45B994CAC2CA4B9F98017CE9@opal.vintek.local

Hello All,

Could someone help me extract some values from the following HTML?

<tr>
 <td><b>Find this one</b>&nbsp;</td>
 <td><tt>one 1</tt>&nbsp;</td>
</tr>
 <tr>
 <td><b>Find this two</b>&nbsp;</td>
 <td><tt>two 2</tt>&nbsp;</td>
</tr>
 <tr>
 <td><b>Find this three</b>&nbsp;</td>
 <td><tt>three 3</tt>&nbsp;</td>
</tr>

Basically, using HTML::TableExtract, I want to extract the second cell of each row:

one 1
two 2
three 3

So far I have the following code...

    $inputSite = "<URL>";
    $tree = HTML::TreeBuilder->new;
    $address = "http://" . $inputSite;
    $request = HTTP::Request->new('GET', $address);
    $response = $ua->request($request);
    my $found = 0;

    $tree->parse($response->content);
    $html_string = $tree->as_HTML;
    $te = new HTML::TableExtract( headers => [qw(one two three)] );
    $te->parse($html_string);
    foreach $ts ($te->table_states) {
      foreach $row ($ts->rows) {
        $mRow = "@$row";
      }
    }
    print $mRow;

Are there any obvious errors in this?

Regards,

Dan
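
A few things in the code above may be worth checking. As far as I understand HTML::TableExtract, the headers option looks for a single row whose cells match all the given strings, and here "one", "two" and "three" sit in different rows, so no table would be selected. Also, $mRow is overwritten on every pass and printed only once after the loops, so at most the last row could appear, and $ua is not defined in the excerpt shown. The following is only a minimal sketch of one possible approach, not a definitive fix: it drops the headers constraint and takes the second cell of each row. The enclosing <table> element and the @values array are assumptions made for illustration, since the post shows only the <tr> rows, and a literal string stands in for the fetched page.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Sample markup modeled on the rows in the post. The enclosing <table>
# element is an assumption; the post shows only the <tr> rows.
my $html = <<'HTML';
<table>
<tr>
 <td><b>Find this one</b>&nbsp;</td>
 <td><tt>one 1</tt>&nbsp;</td>
</tr>
<tr>
 <td><b>Find this two</b>&nbsp;</td>
 <td><tt>two 2</tt>&nbsp;</td>
</tr>
<tr>
 <td><b>Find this three</b>&nbsp;</td>
 <td><tt>three 3</tt>&nbsp;</td>
</tr>
</table>
HTML

# No 'headers' constraint: the labels here are ordinary cells in
# separate rows, not a header row, so every table is examined.
my $te = HTML::TableExtract->new;
$te->parse($html);

my @values;    # hypothetical name, collects one value per row
for my $table ($te->tables) {
    for my $row ($table->rows) {
        my $cell = $row->[1];          # second column holds the value
        next unless defined $cell;     # skip rows without a second cell
        # Trim whitespace and the non-breaking space, whether it arrives
        # decoded (\xA0) or as a literal &nbsp; entity.
        $cell =~ s/^(?:\s|\xA0|&nbsp;)+//;
        $cell =~ s/(?:\s|\xA0|&nbsp;)+$//;
        push @values, $cell;
    }
}

print "$_\n" for @values;
```

Collecting the values in an array inside the loop, instead of assigning to one scalar, keeps every row. In the real script, $response->content could be passed to parse directly, which would make the HTML::TreeBuilder round trip unnecessary.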
