develooper Front page | perl.beginners | Postings from December 2003

Problems with LWP::UserAgent

From: Dan Anderson
Date: December 24, 2003 13:05
Subject: Problems with LWP::UserAgent
Message ID: m2fzfancz7.fsf@syr-24-59-76-83.twcny.rr.com

        I am trying to write a spider that grabs my books from Safari
for a batch printing job, so that I don't need to go through each
chapter myself and hit the Print button. I used this script to try to
log in to the Safari site:

# BEGIN CODE
#! /usr/bin/perl

use strict;
use warnings;
use LWP;
use LWP::UserAgent;

# variables
my $cookie_jar_file = "./cookies.txt";
my @headers = (
    'User-Agent'      => 'Mozilla/4.76 [en] (Win98; U)',
    'Accept'          => 'image/gif, image/x-bitmap, image/jpeg,
                          image/pjpeg, image/png, */*',
    'Accept-Charset'  => 'iso-8859-1,*',
    'Accept-Language' => 'en-US',
    "catid"           => "",
    "s"               => "1",
    "o"               => "1",
    "b"               => "1",
    "t"               => "1",
    "f"               => "1",
    "c"               => "1",
    "u"               => "1",
    "r"               => "",
    "l"               => "1",
    "g"               => "",
    "usr"             => "myemail",
    "pwd"             => "mypassword",
    "savepwd"         => "1",
);
# end variables

my $user_agent = LWP::UserAgent->new;
$user_agent->cookie_jar({file => $cookie_jar_file});
my $response = $user_agent->post(
    'http://safari.oreilly.com/JVXSL.asp',
    @headers,
);
# END CODE

        I know that this is the form I should post to, because I
extracted the following form from the web page (and there is no
JavaScript that modifies it):

<form action="JVXSL.asp" method="post">
<input type="hidden" name="catid" value="">
<input type="hidden" name="s" value="1">
<input type="hidden" name="o" value="1">
<input type="hidden" name="b" value="1">
<input type="hidden" name="t" value="1">
<input type="hidden" name="f" value="1">
<input type="hidden" name="c" value="1">
<input type="hidden" name="u" value="1">
<input type="hidden" name="r" value="">
<input type="hidden" name="l" value="1">
<input type="hidden" name="g" value="">
<input name="usr" type="text" value="" size="12">
<input name="pwd" type="password" value="" size="12">
<input type="checkbox" name="savepwd" value="1">
<input type="image" name="Login" src="images/btn_login.gif" width="40" height="16" border="0" align="absmiddle">
</form>
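
Rather than copying the fields by hand, the form markup itself can be
used to build the request. The following is a minimal sketch, assuming
the HTML::Form module is installed and assuming the page lives at
http://safari.oreilly.com/ (inferred from the URL in the script, not
confirmed). It also shows one detail that hand-copying tends to miss:
a browser submitting through an image button sends Login.x and Login.y
click coordinates, which the server may or may not require.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# The login form exactly as quoted in the post.
my $html = <<'HTML';
<form action="JVXSL.asp" method="post">
<input type="hidden" name="catid" value="">
<input type="hidden" name="s" value="1">
<input type="hidden" name="o" value="1">
<input type="hidden" name="b" value="1">
<input type="hidden" name="t" value="1">
<input type="hidden" name="f" value="1">
<input type="hidden" name="c" value="1">
<input type="hidden" name="u" value="1">
<input type="hidden" name="r" value="">
<input type="hidden" name="l" value="1">
<input type="hidden" name="g" value="">
<input name="usr" type="text" value="" size="12">
<input name="pwd" type="password" value="" size="12">
<input type="checkbox" name="savepwd" value="1">
<input type="image" name="Login" src="images/btn_login.gif" width="40" height="16" border="0" align="absmiddle">
</form>
HTML

# Assumption: the base URL of the page that contains the form.
my $base_url = 'http://safari.oreilly.com/';

# Parse the markup; hidden fields keep the values given in the HTML.
my ($form) = HTML::Form->parse($html, $base_url);

# Fill in the visible fields (placeholder credentials from the post).
$form->value(usr     => 'myemail');
$form->value(pwd     => 'mypassword');
$form->value(savepwd => 1);

# "Click" the image button. This returns an HTTP::Request object and
# adds the Login.x / Login.y coordinates a browser would send.
my $request = $form->click('Login');

# Print the request for inspection instead of sending it.
print $request->as_string;
```

The resulting $request can be handed to $user_agent->request($request)
once the printed body looks right.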

        When I run the script, $response->content is empty. I know
that safari.oreilly.com returns a blank page if it doesn't like the
user agent, and that after a successful login it returns to the
safari.oreilly.com page with a very large number of GET variables.
Does anyone know what I might be doing wrong?
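
One likely cause, offered as a sketch rather than a confirmed
diagnosis: when post() receives a flat list after the URL, LWP treats
every pair as an HTTP header, so the form fields in the script above
are sent as headers and the request body is empty. Form fields need to
be passed as an array or hash reference, with real headers after it.
Two further assumptions are built into the sketch below: the login
answers with a redirect (LWP does not follow redirects for POST unless
told to), and the site uses session cookies (HTTP::Cookies does not
save those unless ignore_discard is set). The --send flag is a
hypothetical switch added here so the request can be inspected before
anything goes over the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# Form fields, copied from the <form> markup in the post.
my @form = (
    'catid'   => '',
    's'       => 1,
    'o'       => 1,
    'b'       => 1,
    't'       => 1,
    'f'       => 1,
    'c'       => 1,
    'u'       => 1,
    'r'       => '',
    'l'       => 1,
    'g'       => '',
    'usr'     => 'myemail',      # placeholder credentials from the post
    'pwd'     => 'mypassword',
    'savepwd' => 1,
);

# Real HTTP headers only; the Accept value is kept on a single line.
my @headers = (
    'Accept'          => 'image/gif, image/x-bitmap, image/jpeg, image/pjpeg, image/png, */*',
    'Accept-Charset'  => 'iso-8859-1,*',
    'Accept-Language' => 'en-US',
);

# The User-Agent string is set on the agent itself.
my $user_agent = LWP::UserAgent->new(agent => 'Mozilla/4.76 [en] (Win98; U)');

# Save cookies on exit, including session cookies (an assumption about the site).
$user_agent->cookie_jar({
    file           => './cookies.txt',
    autosave       => 1,
    ignore_discard => 1,
});

# Allow redirects in response to POST (an assumption about the login flow).
push @{ $user_agent->requests_redirectable }, 'POST';

# Passing \@form as the second argument puts the fields in the request body.
my $request = POST('http://safari.oreilly.com/JVXSL.asp', \@form, @headers);

if (@ARGV && $ARGV[0] eq '--send') {
    my $response = $user_agent->request($request);
    # The status line tells a redirect or an error apart from a blank page.
    print $response->status_line, "\n";
    print length($response->content), " bytes of content\n";
}
else {
    # Dry run: show exactly what would be sent.
    print $request->as_string;
}
```

Run without arguments, this only prints the request, which makes it
easy to confirm that usr and pwd appear in the body rather than among
the headers. Checking $response->status_line before looking at
$response->content is also a quick way to see whether an empty page is
really a redirect or an error.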

        Also, I figure I'm not the only person who would want to do
this.  Is anyone interested in starting a SourceForge project with me
and releasing it under the GPL?

-Dan


