
SV: HTML::Parser question

From: Jonas Nordström
Date: October 30, 2001 04:41
Subject: SV: HTML::Parser question
Message ID: 960FCE94AC10D1119EFF00A02416D7EA024EC3F0@queen.es.sigma.se

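use HTML::Parser;
my $buf = '';    # collects the entity-decoded text chunks
HTML::Parser->new(text_h => [sub { $buf .= shift }, 'dtext'])
    ->parse($content)->eof;    # eof flushes any text still buffered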

/Jonas Nordström
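
For anyone who wants the one-liner above in a reusable form, here is a minimal sketch. It assumes the whole document is already in a string; the helper name html_to_text and the sample markup are made up for illustration and are not part of HTML::Parser.

```perl
use strict;
use warnings;
use HTML::Parser ();

# html_to_text is a hypothetical helper name, not an HTML::Parser function.
sub html_to_text {
    my ($html) = @_;
    my $buf = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'dtext' hands the callback text with HTML entities already decoded.
        text_h => [ sub { $buf .= shift }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;    # signal end of document so buffered text is delivered

    return $buf;
}

# Sample input, made up for this example.
my $content = '<html><body><h1>Hello</h1><p>Tom &amp; Jerry</p></body></html>';
my $text    = html_to_text($content);
print "$text\n";
```

Note that the text handler also sees the contents of script and style elements. Newer HTML::Parser releases offer $parser->ignore_elements(qw(script style)) for skipping those, which may or may not be available in the version you have installed.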


-----Original Message-----
From: ADJE WebMail Technical Support Team [mailto:support@adjeweb.com]
Sent: October 30, 2001 05:00
To: libwww@perl.org
Subject: HTML::Parser question


Question: How do I extract the plain text from an HTML file or, put
another way, how do I remove the HTML markup, leaving just the plain
text? I have looked at the example provided with HTML::Parser, in
particular

HTML-Parser-3.25/eg/htext

which comes close to what I need. However, I would like to store the
plain text in a variable, as opposed to having it printed to STDOUT
(standard output). Any ideas?

ADJE WebMail Technical Support
http://www.adjeweb.com
support@adjeweb.com



