develooper Front page | perl.beginners | Postings from April 2012

Re: Regex again..

From: Shlomi Fish
Date: April 14, 2012 02:33
Subject: Re: Regex again..
Message ID: 20120414123313.22590b17@lap.shlomifish.org

Hi Somu,

On Sat, 14 Apr 2012 12:56:03 +0530
Somu <som.ctc@gmail.com> wrote:

> Hi all,
> I was trying to strip off all html tags and the special characters from a
> html file using regex.
> my code is as follows..

Please don't use regular expressions to parse and process HTML:

* http://perl-begin.org/FAQs/freenode-perl/#I_need_to_parse_HTML_with_Perl_.28and_my_Regular_Expression_does_not_work.29

(short URL - http://xrl.us/bm3p8u ).

* http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

(Especially the first answer, which is very amusing.)

Please use a proper HTML parser or follow Dr. Ruud's advice to use lynx in this
case.
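
As a rough illustration of the "proper HTML parser" route, here is a minimal
sketch using the HTML::Parser module from CPAN (it is not part of core Perl, so
it is assumed to be installed). The helper name html_to_text and the sample
input are made up for this example, and the whitespace clean-up at the end is
just one possible choice:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::Parser ();    # CPAN module, assumed to be installed

# Hypothetical helper: return only the text content of an HTML string.
sub html_to_text {
    my ($html) = @_;

    my $text   = '';
    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'dtext' hands us text with entities such as &amp; already decoded.
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Skip <script> and <style> elements together with their contents.
    $parser->ignore_elements(qw(script style));

    $parser->parse($html);
    $parser->eof;

    # Collapse the whitespace left behind once the markup is gone.
    $text =~ s/\s+/ /g;
    $text =~ s/^ | $//g;

    return $text;
}

print html_to_text(
    '<p>Fish &amp; <b>chips</b><script>var x = "<i>";</script></p>'
), "\n";
```

The parser takes care of the tags, the entities and the script contents, which
is where a hand-written regex usually goes wrong. For the lynx route,
something along the lines of "lynx -dump -nolist page.html" should give a
similar plain-text result, assuming lynx is installed (page.html is a
placeholder file name).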

Regards,

	Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Funny Anti-Terrorism Story - http://shlom.in/enemy

I’d love to change the world, but they won’t give me the source code.
    — Unknown

Please reply to list if it's a mailing list post - http://shlom.in/reply .
