perl.libwww | Postings from March 2001

URI and spidering unique docs

From: Bill Moseley
Date: March 28, 2001 23:58
Subject: URI and spidering unique docs
Message ID: 3.0.3.32.20010328235845.022f9448@pop3.hank.org

Oh my, I'm writing yet another spider, for some reason.

I'd like to spider each document only once, so I'm using a hash keyed
on URI->canonical.
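
A minimal sketch of that bookkeeping, assuming the URI module from CPAN is installed. The names %seen and first_visit are hypothetical, chosen only for illustration; the two sample URLs differ only in letter case and an explicit default port, which canonical is expected to normalize away.

```perl
use strict;
use warnings;
use URI;

my %seen;    # canonical URL string => number of times requested

# Hypothetical helper: true the first time a URL is seen, false on repeats.
sub first_visit {
    my ($url) = @_;
    my $key = URI->new($url)->canonical->as_string;
    return !$seen{$key}++;
}

for my $url (qw(
    http://localhost/path/to/my/file.html
    HTTP://LOCALHOST:80/path/to/my/file.html
)) {
    print first_visit($url) ? "fetch $url\n" : "skip  $url\n";
}
```

Keying on the canonical string rather than the raw link text means that spelling differences in scheme, host, or default port do not trigger a second fetch.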

Although I realize these *could* be two different docs, on our server
they are not:
     http://localhost/path/to/my/file.html
     http://localhost/path/to/../to/my/file.html

Are there any (URI?) tricks for seeing those as the same document?
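
One possible approach, sketched here as an assumption rather than a built-in URI feature: collapse the "." and ".." path segments by hand before building the hash key. The helper name dedupe_key is hypothetical. It relies only on the URI module's path and canonical methods, and it is safe only when the server really does map both spellings to one document, as described above.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: canonicalize a URL and collapse "." and ".."
# path segments so equivalent spellings share one hash key.
sub dedupe_key {
    my ($url) = @_;
    my $uri = URI->new($url)->canonical;

    my @in = split m{/}, $uri->path, -1;
    # A path ending in "." or ".." names a directory: keep the trailing slash.
    push @in, '' if @in && $in[-1] =~ /^\.\.?$/;

    my @out;
    for my $seg (@in) {
        if    ($seg eq '.')  { next }
        elsif ($seg eq '..') { pop @out if @out > 1 }   # never climb above the root
        else                 { push @out, $seg }
    }

    $uri->path(join '/', @out);
    return $uri->as_string;
}

print dedupe_key('http://localhost/path/to/my/file.html'), "\n";
print dedupe_key('http://localhost/path/to/../to/my/file.html'), "\n";
```

Both calls should yield the same key, so a %seen-style hash would treat the second URL as already visited. Query strings and fragments are left untouched by this sketch.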


Bill Moseley
mailto:moseley@hank.org
