
Re: How do I read a web page from within perl?

From: John
Date: January 30, 2002 08:28
Subject: Re: How do I read a web page from within perl?
Message ID: jUsT.aNoTheR.mEsSaGe.iD.10124079696298@jpw3.com

I'm sure I'll get some crap for this old quick-and-dirty code, but
here is an example in case you are interested in the HTTP status
codes (200, etc.) or want more flexibility. It is cut and pasted
from a function, so there is no header, etc. I never got https to
work; I think the firewall is closed to 443.

    use IO::Socket;
    use Sys::Hostname;

    my $ip = shift( @_ );
    my $port = shift( @_ );
    my $hostname = hostname();
    my $retval = "";
    my $protocol = "HTTP";

    if ( $port == 443 )
    {
        $protocol = "HTTPS";
    }

    my @pages = @_;

    foreach my $page ( @pages )
    {
        chomp( $page );

        # Skip entries shorter than two characters (e.g. blank lines).
        if ( $page !~ m#..# )
        {
            next;
        }

        my $remote = IO::Socket::INET->new  (   Proto       => "tcp",
                                                PeerAddr    => $ip,
                                                PeerPort    => "${protocol}($port)",

                                            );

        if ( ! $remote )
        {
            $retval .= ( "ERROR : $hostname : unable to create $protocol socket to $ip\n\n" );
        }
        else
        {
            $remote->autoflush( 1 );
            $page =~ s#\\#/#g;

            if ( $page !~ m#^/# )
            {
                $page = "/${page}";
            }

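            # The request line always names HTTP, even on port 443. A plain
            # socket does no SSL/TLS handshake, so HTTPS servers will not
            # answer this request.
            my $get = "GET $page HTTP/1.0";
            print $remote "$get\r\n\r\n";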
            my $line = <$remote>;
            my @rest = <$remote>;

#            if ( $line !~ m#^HTTP.?/.* (200 OK|302 Object Moved)#i )
#            if ( $line !~ m#HTTP.?/.* (200 OK|302 Object Moved)#i )

            if ( $line =~ m#HTTP.?/.* (404 Object Not Found)#i )
            {
                $retval .= "ERROR : $protocol $get from $ip returns ${line}\n";
            }
            else
            {
                $retval .= "${protocol}://${ip}:${port}${page} : $line\n";
            }

            close( $remote );
        }
    }

    return( $retval );
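
The check above matches on the reason phrase ("404 Object Not Found"),
and that wording differs from server to server. A minimal sketch of
testing the numeric code instead follows. The helper name
parse_status_line is hypothetical and not part of the function above;
it only assumes the first line read from the socket is a standard
HTTP status line.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): split an HTTP status
# line into version, numeric code and reason phrase.
# Returns an empty list when the line does not look like a status line.
sub parse_status_line {
    my ($line) = @_;
    return unless defined $line;

    # Example input: "HTTP/1.1 404 Object Not Found\r\n"
    if ( $line =~ m{^HTTP/(\d+\.\d+)\s+(\d{3})(?:\s+(.*?))?\s*$} ) {
        return ( $1, $2, defined $3 ? $3 : '' );
    }
    return;
}
```

With this in place, the loop could test something like
"$code >= 400" rather than a fixed phrase, which should also catch
servers that say "404 Not Found".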


At Wednesday, 30 January 2002, "Collins, Joe (EDSI\\BDR)" <JCollins2@exchange.ml.com> wrote:

>For example, suppose I want to capture www.cnn.com into an array
>and process the text. How does one do this?
>
>Many thanks,
>
>Joe
>
>-- 
>To unsubscribe, e-mail: beginners-unsubscribe@perl.org
>For additional commands, e-mail: beginners-help@perl.org
>
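
For the question as quoted (capture a page into an array), a shorter
route than raw sockets is an HTTP client module. The following is only
a sketch, assuming HTTP::Tiny is available (it ships with recent Perl
releases; https URLs additionally need IO::Socket::SSL installed). The
names fetch_lines and split_lines are hypothetical helpers, not
anything from the post above.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Split a response body into lines, dropping LF or CRLF line endings.
sub split_lines {
    my ($content) = @_;
    return split /\r?\n/, $content;
}

# Fetch $url and return the body as a list of lines; dies on failure.
sub fetch_lines {
    my ($url) = @_;
    my $response = HTTP::Tiny->new( timeout => 10 )->get($url);
    die "GET $url failed: $response->{status} $response->{reason}\n"
        unless $response->{success};
    return split_lines( $response->{content} );
}
```

A call such as "my @page = fetch_lines('http://www.cnn.com/');" would
then leave one line of the page per array element, ready for a
foreach loop or grep.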







