Problems with LWP::UserAgent
From: Dan Anderson
Date: December 24, 2003 13:05
Subject: Problems with LWP::UserAgent
Message ID: m2fzfancz7.fsf@syr-24-59-76-83.twcny.rr.com
I am trying to write a spider to grab my books from Safari for a
batch printing job, so that I don't have to go through each chapter
myself and hit the Print button. I used this script to try to log in
to the Safari site:
# BEGIN CODE
#! /usr/bin/perl
use strict;
use warnings;
use LWP;
use LWP::UserAgent;
# variables
my $cookie_jar_file = "./cookies.txt";
my @headers = (
    'User-Agent'      => 'Mozilla/4.76 [en] (Win98; U)',
    'Accept'          => 'image/gif, image/x-bitmap, image/jpeg, image/pjpeg, image/png, */*',
    'Accept-Charset'  => 'iso-8859-1,*',
    'Accept-Language' => 'en-US',
    "catid"   => "",
    "s"       => "1",
    "o"       => "1",
    "b"       => "1",
    "t"       => "1",
    "f"       => "1",
    "c"       => "1",
    "u"       => "1",
    "r"       => "",
    "l"       => "1",
    "g"       => "",
    "usr"     => "myemail",
    "pwd"     => "mypassword",
    "savepwd" => "1",
);
# end variables
my $user_agent = LWP::UserAgent->new;
$user_agent->cookie_jar({file => $cookie_jar_file});
my $response = $user_agent->post(
    'http://safari.oreilly.com/JVXSL.asp',
    @headers,
);
# END CODE
I know this is the form I should post to because I extracted it
from the login page, and there is no JavaScript that modifies it:
<form action="JVXSL.asp" method="post">
<input type="hidden" name="catid" value="">
<input type="hidden" name="s" value="1">
<input type="hidden" name="o" value="1">
<input type="hidden" name="b" value="1">
<input type="hidden" name="t" value="1">
<input type="hidden" name="f" value="1">
<input type="hidden" name="c" value="1">
<input type="hidden" name="u" value="1">
<input type="hidden" name="r" value="">
<input type="hidden" name="l" value="1">
<input type="hidden" name="g" value="">
<input name="usr" type="text" value="" size="12">
<input name="pwd" type="password" value="" size="12">
<input type="checkbox" name="savepwd" value="1">
<input type="image" name="Login" src="images/btn_login.gif" width="40" height="16" border="0" align="absmiddle">
</form>
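For comparison, here is a minimal sketch of how the fields of that form could be sent as a POST body rather than as request headers. It rests on two things: that HTTP::Request::Common (which LWP itself depends on) is installed, and that the server may expect the `Login.x` / `Login.y` click coordinates a browser sends for an `<input type="image">`. Whether the site actually requires those coordinates is an assumption. The sketch only builds and prints the request and sends nothing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Form fields copied from the <form> above; usr and pwd are placeholders.
my @form = (
    catid => '', s => 1, o => 1, b => 1, t => 1, f => 1,
    c => 1, u => 1, r => '', l => 1, g => '',
    usr => 'myemail', pwd => 'mypassword', savepwd => 1,
    # Assumption: mimic the click coordinates a browser sends for the
    # image button named "Login". The values 1,1 are arbitrary.
    'Login.x' => 1, 'Login.y' => 1,
);

# The array ref becomes the urlencoded body; pairs after it become headers.
my $request = POST(
    'http://safari.oreilly.com/JVXSL.asp',
    \@form,
    'User-Agent'      => 'Mozilla/4.76 [en] (Win98; U)',
    'Accept-Language' => 'en-US',
);

# Print the request for inspection instead of sending it.
print $request->as_string;
```

The same shape works with `$user_agent->post($url, \@form, %headers)`. Name/value pairs passed directly after the URL, with no array or hash reference in front of them, are treated as header fields and not as form data.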
When I run the script, $response->content is empty. I know that
safari.oreilly.com returns a blank page if it doesn't like the user
agent, and that after signing in it sends you back to the
safari.oreilly.com page with a very large number of GET variables.
Does anyone know what I might be doing wrong?
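An empty body can have several causes, so a first step could be to look at the status line and any Location header instead of only the content. The helper below, `describe_response`, is a hypothetical name introduced for illustration, and the example response is hand-built with a made-up Location value so that nothing is sent over the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: summarize a response so an empty body can be explained.
sub describe_response {
    my ($response) = @_;
    my $summary = $response->status_line;
    if ($response->is_redirect) {
        # A 3xx answer usually has no body; the target is in Location.
        my $location = $response->header('Location');
        $summary .= ' -> ' . (defined $location ? $location : '(no Location header)');
    }
    $summary .= ', body length ' . length($response->content);
    return $summary;
}

# Example with a hand-built response; '/login-done' is a made-up target.
my $example = HTTP::Response->new(302, 'Found', [ Location => '/login-done' ], '');
print describe_response($example), "\n";
```

If the real `$response` turns out to be a redirect, note that LWP::UserAgent by default follows redirects only for GET and HEAD requests. Its documentation gives `push @{ $user_agent->requests_redirectable }, 'POST';` as the way to allow them for POST as well.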
Also, I figure I'm not the only person who would want to do
this. Is anyone interested in starting a SourceForge project with me
and releasing it under the GPL?
-Dan