perl.libwww | Postings from September 2008

Effect of Namespaces in XHTML on Headers

From: Phil Archer
Date: September 1, 2008 02:16
Subject: Effect of Namespaces in XHTML on Headers

Hi,

I've used LWP in several apps in which the key bit of information I'm 
after is the headers. I've therefore got used to the fact that if the 
returned resource is HTML, one of the triggers for "OK, that's all the 
headers and everything else must be content" is the presence of anything 
in the <head> section of the document that LWP doesn't recognise.

Take this, for example:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html 
xmlns:creativeCommons='http://backend.userland.com/creativeCommonsRssModule'
  xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">

<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/</creativeCommons:license>

<head profile="http://gmpg.org/xfn/11">
...

Perfectly valid XHTML, but LWP doesn't recognise the 
<creativeCommons:license> element and so stops parsing the headers.
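
The effect can be reproduced offline with HTML::HeadParser, the module LWP relies on to scan the <head> of an HTML response. What follows is only a minimal sketch, assuming HTML::Parser is installed; the cut-down sample markup and the head_title helper are made up for illustration and are not taken from any of the apps mentioned above.

```perl
use strict;
use warnings;
use HTML::HeadParser;

# Hypothetical helper: run only the <head> parser over a document and
# return the title it collected (undef if it stopped before seeing one).
sub head_title {
    my ($html) = @_;
    my $parser = HTML::HeadParser->new;
    $parser->parse($html);    # returns false once the parser decides the head is over
    return scalar $parser->header('Title');
}

# Cut-down version of the document above, with the namespaced element
# sitting in front of <head>.
my $with_element = <<'HTML';
<html xmlns:creativeCommons='http://backend.userland.com/creativeCommonsRssModule'
  xmlns="http://www.w3.org/1999/xhtml">
<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/</creativeCommons:license>
<head><title>Test page</title></head>
<body><p>Hello</p></body>
</html>
HTML

# Same document with the namespaced element commented out.
(my $without_element = $with_element)
    =~ s{(<creativeCommons:license>.*?</creativeCommons:license>)}{<!-- $1 -->}s;

for my $case ([ 'with element', $with_element ], [ 'commented out', $without_element ]) {
    my $title = head_title($case->[1]);
    printf "%-14s title: %s\n", $case->[0], defined $title ? $title : '(not seen)';
}
```

If the sketch behaves as I expect, the title is only picked up in the second case, which matches what LWP does with the real pages.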

The User Agent package I'm using is version 2.31.

So, some questions:

1. Which modules need updating so that LWP can recognise this kind of 
thing as valid <head> content?

2. Has anyone written such a module?

As a demonstration, [1] and [2] show the status line, headers_as_string 
and content for two versions of the same document. The only difference 
between the two is that in [2] the <creativeCommons:license> element is 
commented out. You can get this output for any URI using the form at [3].
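
For anyone who would rather compare the two documents locally, something along these lines prints the same three parts of a response. It is a rough sketch and not the code behind the form; the report helper is a name invented here. It fetches the URI twice, once with LWP's <head> parsing on (the default) and once with it off via the parse_head option, so that any headers taken from the document itself stand out.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Hypothetical helper: format the status line, the headers and the
# content of an HTTP::Response as one block of text.
sub report {
    my ($response) = @_;
    return join "\n",
        $response->status_line,
        $response->headers_as_string,
        $response->content;
}

# Only touch the network when a URI is given on the command line.
if (@ARGV) {
    for my $parse_head (1, 0) {
        my $ua = LWP::UserAgent->new(parse_head => $parse_head);
        print "--- parse_head => $parse_head ---\n";
        print report($ua->get($ARGV[0])), "\n";
    }
}
```

Run it with the URI as the only argument, for example one of the two test pages behind [1] and [2].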

Thanks for any help

Phil.

[1] http://www.icra.org/cgi-bin/HTTP_Headers.cgi?url=http%3A%2F%2Fwww.icra.org%2Flabel%2FHTTP-Test%2Fspace.htm
[2] http://www.icra.org/cgi-bin/HTTP_Headers.cgi?url=http%3A%2F%2Fwww.icra.org%2Flabel%2FHTTP-Test%2Fspace-mod.htm
[3] http://www.icra.org/label/HTTP-Test/

-- 
Phil Archer
Chief Technical Officer,
Family Online Safety Institute
w. http://www.fosi.org/people/philarcher/

Register now for the annual Family Online Safety Institute Conference 
and Exhibition, December 11th, 2008, Washington, DC.
See http://www.fosi.org/conference2008/

