Re: web scraping
From: Octavian Rasnita
Date: April 28, 2008 13:53
Subject: Re: web scraping
First, search search.cpan.org for "Finance" (without the quotes) to see
whether a module already downloads the data you want. If none does, you
can fetch the page with LWP::UserAgent or WWW::Mechanize and extract the
value with regular expressions.
A very simple example that gets the title of Google's page:
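use strict;
use warnings;
use LWP::Simple;
# get() returns undef on failure, so check before matching
my $content = get("http://www.google.com/") or die "Could not fetch page\n";
# Capture the text between the <title> tags (case-insensitive, may span lines)
my ($title) = $content =~ /<title[^>]*>(.*?)<\/title[^>]*>/si;
print "$title\n" if defined $title;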
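The same idea can be stretched to pull a number off a page, which is closer to the original question. This is only a rough sketch, not a tested scraper for any particular site: the helper names extract_value and fetch_page are made up for illustration, the label 'Dow' is a guess at what the page prints next to the figure, and the regex assumes the number follows the label with nothing but tags or whitespace in between. Real pages will need the pattern adjusted.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first number that follows $label in the
# HTML, skipping any tags or whitespace in between (e.g. "12,871.75").
sub extract_value {
    my ($html, $label) = @_;
    return unless defined $html;
    my ($value) = $html =~ /\Q$label\E(?:\s|<[^>]*>)*(\d[\d,]*(?:\.\d+)?)/si;
    return unless defined $value;
    $value =~ tr/,//d;    # drop thousands separators
    return $value;
}

# Hypothetical helper: fetch a page and return its decoded body, or die.
# LWP::UserAgent is loaded here so the parsing code works without it.
sub fetch_page {
    my ($url) = @_;
    require LWP::UserAgent;
    my $ua       = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->get($url);
    die "Fetch failed: ", $response->status_line, "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Only touch the network when run as a script with a URL argument.
if (!caller && @ARGV) {
    my $value = extract_value(fetch_page($ARGV[0]), 'Dow');
    print defined $value ? "$value\n" : "No value found\n";
}
```

Run it with the page address as the only argument. Keeping the fetch and the parsing in separate subs means the regex can be tried against a saved copy of the page before going near the live site.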
Octavian
----- Original Message -----
From: "Rob Dixon" <rob.dixon@gmx.com>
To: <beginners@perl.org>
Cc: "Alex Goor" <a_goor@yahoo.com>
Sent: Monday, April 28, 2008 9:15 PM
Subject: Re: web scraping
> Alex Goor wrote:
>> I was hoping to write a simple program (if that's possible) to open a
>> browser, go to a site, and scrape a piece of information from that
>> site.
>>
>> For example, I was hoping to open a Safari or Firefox browser, go to
>> nyt.com and scrape the Dow Jones Industrial Average which is on the
>> homepage.
>>
>> Does anyone know where I could get an example program that does this
>> kind of thing to teach myself the concepts?
>
> Driving an actual Web browser is awkward and unnecessary unless the page
> you want cannot be handled with a Perl module.
>
> Take a look at WWW::Mechanize and see if it suits your purpose.
>
> Rob