HTTP Compression Speeds up the Web

Technical Overview

HTML/XML/JavaScript/text compression: Does it make sense?

The short answer is "only if it can get there quicker." In 99% of all cases it makes sense to compress the data. However, several problems need to be solved to enable seamless transmission from the server to the consumer.

  • Compression should not conflict with MIME types
  • Dynamic compression should not affect server performance
  • Server should be smart enough to know whether the user’s browser can decompress the content
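The first and third requirements can be sketched as a small server-side check. This is a minimal illustration, not the logic of any particular server: the function names and the list of compressible MIME types are assumptions made for the example, and wildcard (`*`) codings are deliberately ignored to keep it short. It relies only on the standard HTTP `Accept-Encoding` request header, which browsers use to announce the encodings they can decompress.

```python
# Hypothetical list of MIME type prefixes worth compressing (an assumption
# for this sketch; images and archives are already compressed).
COMPRESSIBLE_TYPES = ("text/", "application/xml", "application/javascript")


def client_accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding header value allows gzip."""
    for item in accept_encoding.split(","):
        parts = item.strip().split(";")
        coding = parts[0].strip().lower()
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if coding in ("gzip", "x-gzip") and quality > 0:
            return True
    return False


def should_compress(accept_encoding, content_type):
    """Compress only when the browser can decode gzip and the type is text-like."""
    media_type = content_type.split(";")[0].strip().lower()
    return (client_accepts_gzip(accept_encoding or "")
            and media_type.startswith(COMPRESSIBLE_TYPES))


print(should_compress("gzip, deflate", "text/html; charset=utf-8"))
print(should_compress("gzip", "image/jpeg"))
```

A server that makes this decision per request can send compressed HTML to capable browsers and fall back to the plain file for everyone else.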

Let's create a simple scenario: an HTML file that contains a large music listing in the form of a table.

The file is 679,188 bytes in length.

Let's track this download over a 28K modem and compare the results before and after compression. The theoretical throughput of a 28K modem is 3,600 bytes per second. Reality is more like 2,400 bytes per second, but for the sake of this article we will work at the theoretical maximum. With no modem compression, the file would download in 188.66 seconds. With modem compression running, we can expect an average download time of about 90 seconds, which indicates a compression factor of about 2:1. In effect, modem compression "halved" the number of bytes sent from modem to modem. Note, however, that the server's TCP/IP subsystem still had to stay open long enough to send all 679,188 bytes to the modem for transmission.

What happens if we compress the data before it leaves the server? If we compress the file using standard techniques (which are not optimized for HTML), we can expect it to shrink to 48,951 bytes, a reduction of 92.8%. We are now transmitting only 48,951 bytes (plus some header information, which should also be compressed, but that's another story). Modem compression no longer plays a part because the data is already compressed.
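The arithmetic behind these figures is easy to check. The short sketch below uses only the numbers quoted in this article (file size, compressed size, and the theoretical modem rate); the function and variable names are made up for the example.

```python
FILE_BYTES = 679_188           # original HTML file
COMPRESSED_BYTES = 48_951      # size after server-side compression
MODEM_BYTES_PER_SEC = 3_600    # theoretical 28K modem throughput


def download_seconds(size_bytes, rate=MODEM_BYTES_PER_SEC):
    """Transfer time in seconds at a constant byte rate."""
    return size_bytes / rate


def percent_saved(original, compressed):
    """Percentage of the original size removed by compression."""
    return 100.0 * (1 - compressed / original)


uncompressed_time = download_seconds(FILE_BYTES)
compressed_time = download_seconds(COMPRESSED_BYTES)
saving = percent_saved(FILE_BYTES, COMPRESSED_BYTES)

print(f"uncompressed: {uncompressed_time:.2f} s")   # 188.66 s
print(f"compressed:   {compressed_time:.1f} s")     # 13.6 s
print(f"size saved:   {saving:.1f} %")              # 92.8 %
```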

Where are the performance improvements?

  • Bandwidth is conserved
  • Compression consumes only a few milliseconds of CPU time
  • The server's TCP/IP subsystem only has to serve 48,951 bytes to the modem
  • At a transfer rate of 3,600 bytes per second, the file arrives in 13.6 seconds instead of 90 seconds

Compression clearly makes sense as long as it's seamless and doesn't kill server performance.
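To get a feel for the CPU cost on your own hardware, a rough measurement like the following can help. It is only a sketch: the page is a synthetic stand-in for the music listing (an assumption, since the original file is not available), and such repetitive data typically compresses better than real content, so the exact sizes and timings will not match the figures above.

```python
import gzip
import time


def build_listing(rows=5000):
    """Build a synthetic HTML table standing in for the music listing."""
    parts = ["<html><body><table>"]
    for i in range(rows):
        parts.append(
            f"<tr><td>Artist {i % 97}</td><td>Album {i % 311}</td>"
            f"<td>Track {i}</td></tr>"
        )
    parts.append("</table></body></html>")
    return "\n".join(parts).encode("utf-8")


page = build_listing()

start = time.perf_counter()
compressed = gzip.compress(page)          # standard gzip, default level
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"original:   {len(page)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"gzip time:  {elapsed_ms:.1f} ms")
```

Running it against a representative page from your own site gives a more honest picture than any synthetic table.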

What else remains to be done?

A lot! Better algorithms need to be invented that compress the data stream more efficiently than gzip. Remember, gzip was designed before HTML came along. Any technique that adds a new compression algorithm will require a thin client to decode it, and possibly tunneling techniques to make it "firewall friendly." To sum up, we need:

  1. Improved compression algorithms optimized specifically for HTML/XML
  2. Header compression. Every time a browser requests a page it sends a block of header information. In the case of WAP browsers, this header information can be as large as 900 bytes. With compression this can be reduced by 20-25%, to around 700 bytes, as the redundancy in these headers is very low.
  3. Compression for WAP. (Currently WAP/WML does not support a true entropy encoding technique. It uses binary encoding to compress the tags while ignoring the content.)
  4. Dynamic compression for caching servers. (Download RCTPD Web Accelerator for all caching servers. http://www.remotecommunications.com/rctpd/)
  5. Real time compression/encryption with tunneling.


# # # # #

About the author: Peter Cranstone was a Co-Founder and the Chief Software Architect of HyperSpace Communications, Inc., a software company dedicated to data acceleration technology. He was also a Founder and Principal of The James Group, another company engaged in the development of advanced data compression algorithms. Mr. Cranstone has spent most of his professional career as a technological innovator and inventor. Mr. Cranstone is the co-inventor of two patent-pending applications covering the HyperSpace(R) smart engine and the ElseWare Messaging Alert System, which allows Web-enabled devices to be controlled via simple e-mail. He can be reached at cranstone@msn.com.


Created: Oct. 20, 2000
Revised: Mar. 4, 2003

URL: http://webreference.com/internet/software/servers/http/compression/3.html