develooper Front page | perl.libwww | Postings from April 2003

Re: $ua->parse_head and gzip encoding

From: Mike Simons
Date: April 16, 2003 11:37
Subject: Re: $ua->parse_head and gzip encoding
Message ID: 20030416183050.GB14936@moria.simons-clan.com

On Wed, Apr 16, 2003 at 08:16:10AM -0700, Bill Moseley wrote:
> On Wed, 16 Apr 2003, Mike Simons wrote:
> > - Why don't you want the header parse done on a compressed document?
> 
> LWP::Protocol only checks for text/html but not encoding so it attempts to
> parse with HTML::HeadParser encoded content.  I think that's what I saw...

  Okay, I see the LWP/Protocol.pm collect function creating an
HTML::HeadParser... I can see this might be a problem if the content is
compressed.  I haven't looked at the parse method itself to see if it
will bail silently when it gets non-HTML.
  I guess I didn't run into this problem because the decompression was
happening at a much lower level in my code...
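
  A minimal sketch of the kind of guard being discussed, assuming the
check is made on the Content-Type and Content-Encoding values before any
head parsing starts.  This is not LWP's code; should_parse_head is a
hypothetical helper used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LWP: decide whether the start of a
# response body can safely be handed to an HTML head parser.
sub should_parse_head {
    my ($content_type, $content_encoding) = @_;

    # Only HTML documents have a <head> worth parsing.
    return 0 unless defined $content_type
        && $content_type =~ m{^\s*text/html\b}i;

    # Any content coding other than "identity" means the bytes on the
    # wire are not HTML yet (gzip, deflate, compress, ...).
    if (defined $content_encoding) {
        my @codings = grep { $_ ne '' && $_ ne 'identity' }
                      map  { my $c = lc $_; $c =~ s/^\s+//; $c =~ s/\s+$//; $c }
                      split /,/, $content_encoding;
        return 0 if @codings;
    }
    return 1;
}

for my $case (
    [ 'text/html; charset=iso-8859-1', undef  ],
    [ 'text/html',                     'gzip' ],
    [ 'image/png',                     undef  ],
) {
    my ($type, $encoding) = @$case;
    printf "%-32s %-6s %s\n", $type, defined $encoding ? $encoding : '-',
        should_parse_head($type, $encoding) ? 'parse' : 'skip';
}
```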


> >   I posted a patch to transparently request and decompress (block by
> > block) gzip style documents a little while ago.  The user who creates
> > a UserAgent can request the transparent decompression with an option
> > like "WantCompression => 1".
> 
> That's good.  My quick code doesn't really work for me, though, because I
> do use a UserAgent callback function.  It would be good to have the chunk
> of content passed to the callback uncompressed as well.
> 
> Do you remember the title or URL for the archived thread?

  I don't know the URL because I don't read these lists on the web.
I included the title, date, and sender of the thread with the patch in my
last email; here they are again:

# Date: Tue, 25 Mar 2003 02:10:15 -0500
# From: Mike Simons <msimons@moria.simons-clan.com>
# Subject: Re: Net::HTTP does not use compressed transfers when it should


> I agree that it should be configurable.  Sometimes you will want the
> encoded form returned (but knowing that the head was not parsed).

  If someone requests that LWP use compression, the content should
be decoded for them.  If the user wants to use compression but get the
content back compressed, they would add an 'Accept-Encoding' header to the
request themselves.  Do you agree?
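
  To illustrate what "decoded for them" could mean when a content
callback is in use, here is a rough sketch of block-by-block gunzip that
hands each decoded piece to the user's callback.  It is not the patch
mentioned above and does not touch LWP; make_gunzip_callback is a
hypothetical name, and the compressed "response" is simulated locally
with IO::Compress::Gzip.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;                      # exports MAX_WBITS, Z_OK, ...
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical wrapper: returns a callback that inflates each gzip chunk
# and forwards the decoded bytes to the user's own callback.
sub make_gunzip_callback {
    my ($user_cb) = @_;
    my ($inflater, $status) = Compress::Raw::Zlib::Inflate->new(
        -WindowBits   => 16 + MAX_WBITS(),    # expect gzip header and trailer
        -AppendOutput => 1,
    );
    die "cannot create inflater: $status" unless $inflater;

    return sub {
        my ($chunk) = @_;                     # copy: inflate() consumes its input
        my $decoded = '';
        my $st = $inflater->inflate($chunk, $decoded);
        die "inflate failed: $st"
            unless $st == Z_OK() || $st == Z_STREAM_END();
        $user_cb->($decoded) if length $decoded;
        return;
    };
}

# Simulate a compressed response arriving in small blocks.
my $html = "<html><head><title>demo</title></head><body>hi</body></html>\n" x 20;
my $wire;
gzip(\$html => \$wire) or die "gzip failed: $GzipError";

my $received = '';
my $cb = make_gunzip_callback(sub { $received .= $_[0] });
for (my $pos = 0; $pos < length $wire; $pos += 16) {
    $cb->(substr($wire, $pos, 16));
}

print $received eq $html ? "round trip ok\n" : "mismatch\n";
```

  The user's callback never sees compressed bytes, which is the behaviour
asked for earlier in the thread.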


> I'd also
> like a way to know if the running version of LWP can decode, although I
> suppose seeing a Content-Encoding header would tell my code that it needed
> to do the decoding myself.

  My code didn't "fix" any of the headers... for example, the
first element of the Content-Encoding header should be removed if the code
is doing the decoding, and the Content-Length header should be removed
until the decompression is done.
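
  A rough sketch of that header fix-up, using a plain hash instead of
LWP's header objects so it stands alone.  fix_headers_after_decoding is a
hypothetical helper.  Which element to strip is an assumption here: the
sketch removes the last-listed coding, since HTTP lists codings in the
order they were applied and the last one is the one undone first.

```perl
use strict;
use warnings;

# Hypothetical helper: adjust a hash of response headers after the body
# has been gunzipped on the fly.
sub fix_headers_after_decoding {
    my ($headers) = @_;    # hash ref, e.g. { 'Content-Encoding' => 'gzip' }

    my $value   = $headers->{'Content-Encoding'};
    my @codings = grep { $_ ne '' }
                  map  { my $c = $_; $c =~ s/^\s+//; $c =~ s/\s+$//; $c }
                  split /,/, (defined $value ? $value : '');

    # Assumption: the coding just undone is the last entry in the list.
    return $headers
        unless @codings && lc($codings[-1]) =~ /^(?:x-)?gzip$/;
    pop @codings;

    if (@codings) {
        $headers->{'Content-Encoding'} = join ', ', @codings;
    }
    else {
        delete $headers->{'Content-Encoding'};
    }

    # The advertised length described the compressed bytes, so it no
    # longer applies to the decoded content.
    delete $headers->{'Content-Length'};
    return $headers;
}

my $headers = fix_headers_after_decoding({
    'Content-Type'     => 'text/html',
    'Content-Encoding' => 'gzip',
    'Content-Length'   => 1234,
});
print join(', ', map { "$_: $headers->{$_}" } sort keys %$headers), "\n";
```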

    Later,
      Mike Simons

-- 
GPG key: http://simons-clan.com/~msimons/gpg/msimons.asc
