
Re: How to ignore a named character reference in TreeBuilder?

From: Webley Silvernail
Date: March 17, 2011 17:06
Subject: Re: How to ignore a named character reference in TreeBuilder?
Message ID: 776952.29735.qm@web120608.mail.ne1.yahoo.com




* Webley Silvernail wrote:
>I have some XHTML in utf-8 that includes the named character reference &#160;
>for non-breaking spaces.

(That is a numeric, not a named, character reference.)
>> Yes, sorry I used the wrong term.

>The output is replacing the &#160; with a non-printable character that is
>rendered in various agents as boxes or question marks enclosed in diamonds.

(That likely means you've not specified the character encoding properly.)
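
One way to rule out an encoding problem is to decode the input and encode the output explicitly. The following is only a minimal sketch, assuming the XHTML arrives as UTF-8 bytes; the sample markup and the variable names are illustrative and not taken from the original post.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use HTML::TreeBuilder;

# Illustrative input: UTF-8 bytes containing a raw non-breaking space (C2 A0).
my $bytes = "<p>10\xC2\xA0km</p>";

# Decode bytes to characters before parsing, so the tree holds U+00A0
# as a single character rather than two unrelated bytes.
my $tree = HTML::TreeBuilder->new_from_content( decode( 'UTF-8', $bytes ) );

# Escape only markup characters; U+00A0 stays a literal character here.
my $chars = $tree->look_down( _tag => 'p' )->as_HTML('<>&');

# Encode the characters back to UTF-8 bytes for output.
my $out = encode( 'UTF-8', $chars );

binmode STDOUT;    # write the bytes unchanged
print $out, "\n";
$tree->delete;
```

If the output is then served or saved without a matching charset declaration, agents may still show boxes or replacement characters, so the declared encoding has to agree with the bytes actually written.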

>I've tried adding an explicit decode/encode step and using HTML::Entities, but 
I've had no luck.  Basically, I just want TreeBuilder to ignore the &#160;
>references and pass them through. 

I believe you are looking for the $element->as_HTML($entities) parameter
(see `perldoc HTML::Element` for details; set the parameter to a value
that includes all the characters you want to be escaped in the output).
>> Thanks for pointing this out; this is what I needed. It took me a while to
>> figure out what to specify for $entities, but '\xa0' worked in the end.
>>
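
The fix described above can be sketched roughly as follows. This is a hedged illustration, not the poster's actual code: the sample markup and variable names are assumptions, and `<>&` is added to the set so that markup characters are still escaped, which the bare '\xa0' value would not do.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Illustrative input; &#160; is the numeric reference for a non-breaking space.
my $xhtml = '<p>10&#160;km</p>';

my $tree = HTML::TreeBuilder->new_from_content($xhtml);

# The parser decodes &#160; to the character U+00A0 inside the tree.
# The $entities argument lists the characters to escape on output.
# It is single-quoted on purpose: HTML::Entities treats it as a regex
# character class, so the literal text \xa0 matches U+00A0.
my $html = $tree->look_down( _tag => 'p' )->as_HTML('<>&\xa0');

print $html, "\n";
$tree->delete;
```

With this setting the non-breaking space should be written as an entity reference (HTML::Entities emits the named form `&nbsp;`) instead of a raw character, so the result no longer depends on how the output encoding is declared.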
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 




