develooper Front page | perl.libwww | Postings from April 2001

Re: HTML::Entities

Gisle Aas
April 11, 2001 09:53
Robin Berjon writes:

> I bumped into a problem today using the HTML::Entities module. I'm dealing
> with some XHTML into which I insert hidden input fields, no rocket science
> there. In order to protect the content of the fields, I'm encoding them.
> The problem occurs because the XHTML uses ' (&apos;) as attribute value
> delimiters -- legal in XML -- but HTML::Entities doesn't encode those by
> default. In fact, it doesn't seem to know about &apos;

The reason HTML::Entities doesn't know about &apos; is that it's not
mentioned in the HTML specs:

It is part of XHTML, because it is part of XML.

A quick test with some HTML browsers I had access to reveals:

   Netscape 4.76 doesn't know about it
   Netscape 6 does
   Konqueror 1.9.8 doesn't know about it
   Lynx 2.8.3 decoded it as &#96; instead of &#39;

Given this quick survey, I think it would be unwise to just add it to
HTML::Entities unless we can make it so that it only affects decoding.
It seems more correct to continue to encode ' as &#39;

> It's not a big problem for me as I know how to work around it, and I was
> inches away from submitting a patch, but I was wondering if there was a
> good reason why you hadn't included "'" in the list of default encoded
> characters ? I believe it belongs there with '"', the latter being in the
> list precisely because of attribute values, which can be delimited by both.

But the HTML spec only mentions '"' so I think it makes sense to stick
with it for now.  Especially if we continue to encode ' as &#39;.
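
For anyone who needs ' encoded in the meantime, here is a minimal sketch
of the workaround: pass an explicit set of unsafe characters as the second
argument to encode_entities. The field name and value are made up for
illustration, and the sketch assumes ' is emitted as a numeric reference
(&#39;) rather than as &apos;, as discussed above.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical field value containing both kinds of quote.
my $value = q{it's "quoted" & <ok>};

# The second argument lists the characters to treat as unsafe.
# Including ' in the set asks encode_entities to escape it as well.
my $encoded = encode_entities($value, q{<>&"'});

# Single-quoted attribute delimiters, as in the XHTML from the report.
# The field name 'payload' is illustrative only.
print "<input type='hidden' name='payload' value='$encoded' />\n";
```

With the set spelled out this way, the result does not depend on what the
module encodes by default, so it is safe inside either kind of attribute
delimiter.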

