develooper Front page | perl.libwww | Postings from April 2003

Download file from HTTPS w/ LWP ?

Mark Sutfin
April 16, 2003 13:15
Perl 5.6.1
LWP 5.64
and Crypt-SSLeay


I would like to programmatically download a file from the USPS website. It
is a secure site (the URI is https...) and requires authentication, cookies,
or both; I'm not sure which. Even after reading my new book (Perl and LWP by
S. Burke), it's not apparent to me how to tell.

The files I seek contain address changes and undeliverables. We use them to
update our db.

I have successfully retrieved a list of available files from the website and
checked it against our db to identify the files that I have not yet
seen/processed. However, I receive a 401 when trying to download the
file(s). The error message is quite self-explanatory, but so far I have not
been able to translate it into code changes. Any help solving the 401 and/or
code suggestions (style, efficiency, relevant pages in FM) are most welcome.

Error 401.2
<h2>HTTP Error 401</h2>
401.2 Unauthorized: Logon Failed due to server configuration
This error indicates that the credentials passed to the server do not match
the credentials required to log on to the server. This is usually caused by
not sending the proper WWW-Authenticate header field.
Please contact the Web server's administrator to verify that you have
permission to access to requested resource
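
Since the error text points at the WWW-Authenticate header, one way to tell which scheme the server is asking for is to read that header off the 401 response. Basic credentials only help if the server actually offers Basic; if it lists something else (NTLM, for example), authorization_basic will not satisfy it. The following is a minimal sketch, not a confirmed diagnosis of this site: auth_schemes is a hypothetical helper name, and the 401 response is built by hand so the snippet runs offline. In real use the response would come from $ua->request($req).

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: list the authentication schemes a response advertises.
# Takes the first token of each WWW-Authenticate header line; a line that
# packs several comma-separated challenges is not split further.
sub auth_schemes {
    my ($resp) = @_;
    # In list context, header() returns one element per header line.
    my @challenges = $resp->header('WWW-Authenticate');
    return map { /^\s*(\S+)/ ? $1 : () } @challenges;
}

# Hand-built 401 for demonstration only (assumed header values).
my $demo = HTTP::Response->new(401, 'Unauthorized');
$demo->push_header('WWW-Authenticate' => 'NTLM');
$demo->push_header('WWW-Authenticate' => 'Basic realm="example"');

print join(', ', auth_schemes($demo)), "\n";
```

If the list printed for the real download URL does not include Basic, the fix probably lies in the authentication scheme rather than in the credentials themselves.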

The following code works fine to access the file list, but there's something
different between GETting this page, and downloading a file...

	# Get the web page file list to compare to files processed yesterday
	$ua = LWP::UserAgent->new;
	$req = HTTP::Request->new(GET =>
	$req->authorization_basic('$user', '$pswd');
	@content = $ua->request($req)->as_string;
	$content[0] =~ s/<\/pre>|<br>|<hr>|<\/body>/ /g;
	$content[0] =~ s/\s+/ /g;
	$pattern = "</A>";
	@content_split = split(/$pattern/,$content[0]);

@content_split now contains the create date, size and file name of each
file, which are assigned to local variables and compared to the file info
stored in the db.

Once a filename is found that needs to be processed, I attempt to download
it with the following.

	# Specify output file (target) for download
	my $outfile = "c:\\".$file;	
	open(OUTFILE, ">$outfile") or die "Can't create a file $outfile: $!";

	# Specify the file to download
	$file = '288790SN.102';
	$internet_file =
	# Request
	$resp = HTTP::Request->new(GET =>
	# tried this variation for the request.....
	#	$resp = $browser->get( $internet_file, 
	#				':content_file' => $outfile,
	#	);
	$resp->authorization_basic('$user', '$pswd');
	# Tried this variation for the authorization..
	#	$resp = $browser->credentials(
	#		'',
	#		'',
	#		'$user' => '$pswd'
	#	);
	print $resp->content; 	
	die "Couldn't get the internet file ", $resp->status_line
		unless $resp->is_success;
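
As posted, $resp holds the HTTP::Request object itself and no line hands it to the user agent, so the sketch below separates the two steps. It is only a sketch under stated assumptions: build_request and fetch_to_file are hypothetical helper names, the URL in the usage comment is a placeholder, and it assumes the server accepts Basic authentication. It uses $ua->request($req, $file) rather than $ua->get, since get may not be available in older LWP releases, and it attaches a cookie jar so that any cookies set while fetching the file list are sent again with the download.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Cookies;

# Build a GET request carrying Basic credentials.
# Pass the variables bare: '$user' in single quotes would send the
# literal string "$user" instead of its value.
sub build_request {
    my ($url, $user, $pswd) = @_;
    my $req = HTTP::Request->new(GET => $url);
    $req->authorization_basic($user, $pswd);
    return $req;
}

# Send the request; a filename as the second argument to request()
# tells LWP to save the response content to that file.
sub fetch_to_file {
    my ($ua, $req, $outfile) = @_;
    my $resp = $ua->request($req, $outfile);
    die "Couldn't get ", $req->uri, ": ", $resp->status_line, "\n"
        unless $resp->is_success;
    return $resp;
}

# One agent for both the listing and the download, with an in-memory
# cookie jar shared between the two requests.
my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new);

# Hypothetical usage (placeholder URL and credentials):
# my $req = build_request('https://www.example.com/files/288790SN.102',
#                         $user, $pswd);
# fetch_to_file($ua, $req, 'c:\\288790SN.102');
```

With this approach there is no need to open OUTFILE by hand, because LWP creates and writes the target file itself.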

Mark Sutfin
