perl.libwww | Postings from February 2001

Re: / and DirectoryIndex

From: Reinier Post
Date: February 21, 2001 04:10
Subject: Re: / and DirectoryIndex
Message ID: 20010221130950.A21534@win.tue.nl

On Wed, Feb 21, 2001 at 04:42:20PM +0700, John Indra wrote:
> Hi all...
> 
> How do I tell my user-agent (an LWP::UserAgent object) to NOT download both
> / and index.html or whatever remote sites DirectoryIndex set to?
> Example, my user-agent sees 2 link:
> - http:://www.domain.com/

This :: notation is contagious :-)

> - http:://www.domain.com/index.html

> IF in this situation both link to the same document, my user-agent will be a
> fool if it tries to download both file. How do I make a "smarter" user-agent
> that will know that those 2 links are the same and only perform one GET
> method, either to http:://www.domain.com/ OR
> http:://www.domain.com/index.html?

The server won't tell you whether or not they're the same document.
You have the same problem with server aliases or symlinks: the whole
tree

   http://www.domain.com/a/butreally/b/*

may be identical to

   http://www.domain.com/b/*

Depending on what you find on the server, you may be able to adopt some
heuristics, for instance, '*/index.html always has the same content as */',
but exceptions are always possible.  The only way to be really sure is to
compare the document content, or at least the response headers.
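
A rough sketch of both ideas, the index.html heuristic and the content
check, written in Python with only the standard library rather than with
LWP.  Everything named here is an assumption for illustration: the
INDEX_NAMES list (the real DirectoryIndex is server-specific), the helper
names, and the caller-supplied fetch function, which stands in for whatever
actually performs the GET.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

# Assumed index file names; the server's real DirectoryIndex may differ.
INDEX_NAMES = ("index.html", "index.htm")


def guess_canonical(url):
    """Heuristic only: treat */index.html as the same resource as */."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_NAMES:
        path = head + "/"
    # Drop the fragment; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def content_key(body):
    """Digest of the response body, used to spot identical documents."""
    return hashlib.sha256(body).hexdigest()


def crawl_unique(urls, fetch):
    """Fetch each guessed-canonical URL once and group URLs by content.

    `fetch` is a hypothetical caller-supplied function that takes a URL
    and returns the response body as bytes.
    """
    seen_urls = set()
    by_digest = {}
    for url in urls:
        canon = guess_canonical(url)
        if canon in seen_urls:
            continue  # skipped on the strength of the heuristic alone
        seen_urls.add(canon)
        by_digest.setdefault(content_key(fetch(canon)), []).append(canon)
    return by_digest
```

The heuristic saves a request but can be wrong; the digest comparison is
reliable but only tells you about duplicates after you have downloaded
them, which is the trade-off described above.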

-- 
Reinier
