Re: mod_gzip vs. Apache::Compress
From: Stephen Adkins
October 4, 2002 11:01
Message ID: firstname.lastname@example.org
At 11:57 PM 9/27/2002 -0400, Stephen Adkins wrote:
>Does anyone have experience with automatic compression with the
>output of mod_perl/CGI programs to speed up end-user performance?
I found this thread over on the mod_perl mailing list recently on
dynamic compression of Apache content.
The MS Word document he refers to can be found here.
Its contents follow below.
Web Content Compression FAQ
by Slava Bizyayev (email@example.com)
This FAQ is written mainly for Internet content provider management
familiar with Internet traffic issues and network equipment and its cost.
This document may also be informative for ISP system administrators and
webmasters seeking to improve throughput and bandwidth efficiency.
Q: Why is it important to compress web content?
A: Reduced equipment costs and the competitive advantage of dramatically
faster page loads.
Web content compression noticeably increases delivery speed to clients and
may allow providers to serve higher content volumes without increasing
hardware expenditures. It visibly reduces actual content download time, a
benefit most apparent to users of dialup and high-traffic connections.
Q: How much improvement can I expect?
A: Effective compression can achieve increases in transmission efficiency
from 3 to 20 times.
The compression ratio is highly content-dependent. For example, if the
compression algorithm is able to detect repeated patterns of characters,
compression will be greater than if no such patterns exist. You can
usually expect an improvement of 3 to 20 times. File compression
improvements in excess of 200 times do occur, but such occurrences are
infrequent. On the other hand, I have never seen ratios of less than
2.5 times on text/HTML files. Image files normally employ their
own compression techniques that reduce the advantage of further compression.
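The dependence on repeated patterns can be illustrated with a minimal
sketch. It uses Python's standard zlib module (the same deflate algorithm
that gzip uses); the sample data and the compression_ratio helper are
invented for illustration and are not taken from any of the packages
discussed here.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (zlib deflate)."""
    return len(data) / len(zlib.compress(data))


# Highly repetitive markup, similar in spirit to a generated HTML table.
repetitive = b"<tr><td>item</td><td>0.00</td></tr>\n" * 2000

# Random bytes contain no repeated patterns for the algorithm to exploit.
random_bytes = os.urandom(len(repetitive))

print(f"repetitive: {compression_ratio(repetitive):.1f}x")
print(f"random:     {compression_ratio(random_bytes):.2f}x")
```

The repetitive input should shrink by a large factor, while the random
input should not shrink at all. Real pages fall somewhere in between.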
On May 21, 2002 Peter J. Cranstone wrote to the firstname.lastname@example.org
mailing list: "With 98% of the world on a dial up modem, all they care
about is how long it takes to download a page. It doesn't matter if it
consumes a few more CPU cycles if the customer is happy. It's cheaper to
buy a newer faster box, than it is to acquire new customers."
Q: How hard is it to implement content compression on an existing site?
A: Implementing content compression on an existing site typically involves
no more than installing and configuring an appropriate Apache handler on
the Web server.
This approach works in most of the cases I have seen. In some special
cases you will need to take extra care with respect to the global
architecture of your web application, but such cases may generally be
readily addressed through various techniques. To date I have found no
fundamental barriers to practical implementation of Web content compression.
Q: Does compression work with standard Web browsers?
A: Yes. No client side changes or settings are required.
All modern browser makers claim to handle compressed content and to
decompress it on the fly, transparently to the user. There
are some known bugs in some old browsers, but these can be taken into
account through appropriate configuration of the Web server.
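Browsers announce their ability to decompress content in the
Accept-Encoding request header, and a server is expected to compress only
when that header allows it. The following is a rough sketch of such a
check. The function name client_accepts_gzip is hypothetical, and the
parsing is deliberately simplified: it looks only for an explicit gzip or
x-gzip token and ignores the "*" wildcard.

```python
def client_accepts_gzip(accept_encoding: str) -> bool:
    """Return True if an Accept-Encoding header value explicitly allows gzip."""
    for item in accept_encoding.split(","):
        coding, _, params = item.partition(";")
        if coding.strip().lower() not in ("gzip", "x-gzip"):
            continue
        qvalue = 1.0  # a coding listed without a q parameter is acceptable
        name, _, value = params.partition("=")
        if name.strip().lower() == "q":
            try:
                qvalue = float(value.strip())
            except ValueError:
                qvalue = 0.0  # treat a malformed q value as a refusal
        return qvalue > 0
    return False


print(client_accepts_gzip("gzip, deflate"))
print(client_accepts_gzip("identity"))
```

A server that compresses on this basis would normally also send
Content-Encoding: gzip with the response, and would add its own
exceptions for the old browsers with known bugs.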
Q: What software is required on the server side?
A: There are six known modules/packages for Web content compression
available to date (in alphabetical order):
* Apache::Compress - a mod_perl handler developed by Ken Williams (U.S.)
* Apache::Dynagzip - a family of mod_perl handlers, developed by Slava
  Bizyayev - a Russian programmer residing in the U.S.
* Apache::Gzip - an example of a mod_perl content handler developed by
  Lincoln Stein and Doug MacEachern for their book Writing Apache Modules
  with Perl and C (U.S.)
* Apache::GzipChain - a mod_perl handler developed by Andreas Koenig
* mod_deflate - an Apache handler written in C by Igor Sysoev (Russia).
* mod_gzip - an Apache handler written in C. Original author: Kevin Kiley,
  Remote Communications, Inc. (U.S.)
In February 2002, Nicholas Oxhøj wrote to the email@example.com mailing
list about his experience trying to find an appropriate Apache gzipping
tool for streaming outbound content:
"... I have been experimenting with all the different Apache compression
modules I have been able to find, but have not been able to get the desired
result. I have tried Apache::GzipChain, Apache::Compress, mod_gzip and
mod_deflate, with different results. One I cannot get to work at all. Most
work, but seem to collect all the output before compressing it and sending
it to the browser...
.... Wouldn't it be nice to have some option to specify that the handler
should flush and send the currently compressed output every time it had
received a certain amount of input or every time it had generated a certain
amount of output?..
.... So I am basically looking for anyone who has had any success in
achieving this kind of "streaming" compression, who could direct me at an
appropriate Apache module."
Unfortunately for him, the Apache::Dynagzip package was not then publicly
available.
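The kind of "streaming" compression asked for in that message can be
sketched with Python's standard zlib module. This is only an illustration
of the general technique (periodic sync flushes within one gzip stream),
not the way any of the Apache modules above is implemented. The function
name gzip_stream and the flush_every parameter are assumptions made for
the example.

```python
import zlib
from typing import Iterable, Iterator


def gzip_stream(chunks: Iterable[bytes], flush_every: int = 8192) -> Iterator[bytes]:
    """Compress chunks into a single gzip stream, flushing periodically."""
    # wbits=31 asks zlib to write a gzip header and trailer.
    compressor = zlib.compressobj(wbits=31)
    pending = 0  # input bytes consumed since the last flush
    for chunk in chunks:
        out = compressor.compress(chunk)
        pending += len(chunk)
        if pending >= flush_every:
            # Z_SYNC_FLUSH pushes out everything compressed so far, so the
            # client can decode it without waiting for the end of the page.
            out += compressor.flush(zlib.Z_SYNC_FLUSH)
            pending = 0
        if out:
            yield out
    # The final flush writes any remaining data and the gzip trailer.
    yield compressor.flush()


page = [b"<html><body>", b"<p>row</p>" * 1000, b"</body></html>"]
compressed = b"".join(gzip_stream(page, flush_every=4096))
assert zlib.decompress(compressed, wbits=31) == b"".join(page)
```

Each flush costs a few bytes of compression efficiency, so the flush
interval is a trade-off between latency and output size.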
Q: What makes Apache::Dynagzip more effective than earlier packages?
A: Apache::Dynagzip is most useful when one needs to compress dynamic
outbound Web content (generated on the fly from databases, XML, etc.) when
content length is not known at the time of the request.
Apache::Dynagzip's features include:
* Support for both HTTP/1.0 and HTTP/1.1.
* Control over the chunk size on HTTP/1.1 for on-the-fly content compression.
* Support for any Perl, Java, or C/C++ CGI applications.
* Advanced control over the proxy cache with the Vary: HTTP header.
* Optional control over the content's lifetime in the client's local cache
  with the Expires: HTTP header.
* Optional extra-light compression (removal of leading blank spaces and/or
  blank lines), which works for all browsers, including older ones that
  cannot uncompress gzip format.
* Optional support for server-side caching of the dynamically generated
  (and compressed) content.
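The extra-light compression mentioned in the list can be approximated in
a few lines. The sketch below is an assumption about what removing
leading blank spaces and blank lines amounts to. It is not the
Apache::Dynagzip code, and the function name light_compress is
hypothetical.

```python
def light_compress(text: str) -> str:
    """Drop blank lines and strip leading spaces and tabs from each line."""
    kept = []
    for line in text.splitlines():
        stripped = line.lstrip(" \t")
        if stripped:  # skip lines that are empty or whitespace-only
            kept.append(stripped)
    return "\n".join(kept) + ("\n" if kept else "")


html = "<html>\n\n   <body>\n\t<p>hi</p>\n   \n</html>\n"
print(light_compress(html))
```

Because the output is still plain text, any browser can read it without
gzip support. A filter this naive would also alter whitespace inside
preformatted blocks such as <pre>, so it is suitable only for content
where indentation carries no meaning.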