Extreme HTML Optimization: URL Abbreviation | WebReference

Extreme HTML Optimization

URL Abbreviation

One of the most effective techniques you can use to shrink your page is URL abbreviation using Apache's mod_rewrite module. First seen on Yahoo, the busiest page on the Web, URL abbreviation substitutes a short redirect URL (like "r/pg") for a longer one (like "programming/"). This technique is especially effective for front pages, which typically have a lot of links. We saved 5-6K on our front page using abbreviated URLs.

To set up redirects, first have your IT department install mod_rewrite on your Apache server. They'll need to edit one of your server config files. The following srm.conf directives tell Apache where to find your rewritemap and define the rewrite rule:

RewriteMap      abbr            dbm:/apache/abbr_webref
RewriteRule     ^/r/([^/]*)/?(.*)   ${abbr:$1}$2    [redirect=permanent,last]
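
If you want to check what a short URL will expand to before touching the server, the rule is easy to mimic offline. The following Python is only a sketch under assumptions: the ABBR dictionary is a hypothetical stand-in for the dbm map (three entries taken from our redirects file), expand is a name made up for illustration, and the code models just the pattern match and the map lookup, not the redirect itself.

```python
import re

# Hypothetical in-memory stand-in for the rewrite map (not the real dbm file).
ABBR = {
    "pg": "programming/",
    "dc": "dhtml/column",
    "icp": "http://www.internet.com/corporate/privacy/privacypolicy.html",
}

# Same pattern as the RewriteRule: an abbreviation, an optional slash, then the rest.
RULE = re.compile(r"^/r/([^/]*)/?(.*)")


def expand(path):
    """Return the substitution for an abbreviated path, or None if the rule does not match."""
    match = RULE.match(path)
    if match is None:
        return None
    key, rest = match.groups()
    # With no default given, mod_rewrite substitutes an empty string for an unknown key.
    return ABBR.get(key, "") + rest


if __name__ == "__main__":
    for path in ("/r/pg", "/r/dc/48", "/programming/"):
        print(path, "->", expand(path))
```

Note how the second capture group is simply appended to the looked-up value, which is what lets one map entry cover a whole series of numbered URLs.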

Next, enter the abbreviations you want in the source file for the above-referenced rewritemap ("/apache/abbr_webref.txt" in our case), one per line, with a tab between each abbreviation and the URL it stands for. The following is a sample from our current redirects:

b	dlab/
d	dhtml/
g	graphics/
h	html/
p	perl/
x	xml/
3c	3d/lesson
dd	dhtml/dynomat/
ddh	dhtml/dynomat/hiermenus3/
dc	dhtml/column
gc	graphics/column
...
i2	index2.html
au	authoring/
in	internet/
iv	interviews/
mm	multimedia/
pg	programming/
pr	promotion/
th	tools/html/
tj	tools/javascript/
tl	tools/
tb	tools/browser/
tbj	tools/browser/javascript.html
hl	headlines/
hn	headlines/nh/
s	services/
sd	services/dns/
sf	http://forums.webdeveloper.com/
ss	scripts/
sg	services/graphics/
sr	services/reference/
...
ab	about.html
ns	new/submit.html
nc	new/contest.html
i	http://www.internet.com/
ic	http://www.internet.com/corporate/
...
is	http://www.internet.com/sections
isa	http://www.internet.com/sections/asp.html
isc	http://www.internet.com/sections/careers.html
isw	http://www.internet.com/sections/webdev.html
isd	http://www.internet.com/sections/downloads.html
isi	http://www.internet.com/sections/international.html
isx	http://www.internet.com/sections/linux.html
...
iswn	http://www.internet.com/sections/win.html
iswl	http://www.internet.com/sections/wireless.html
en	http://e-newsletters.internet.com
enm	http://e-newsletters.internet.com/mailinglists.html
icl	http://www.internet.com/corporate/legal.html
icp	http://www.internet.com/corporate/privacy/privacypolicy.html
ert	http://www.earthweb.com/
fkt	http://www.flashkit.com/
...

So to link to our privacy policy, all I have to do now is type <A HREF="/r/icp">Privacy Policy</A>, saving beaucoup bytes. Notice that I use shorter abbreviations for the more frequently used URLs, like our experts' sections ("r/d" = "/dhtml/"). I save even more space with the tutorial abbrevs, which automatically append the column number to the URL fragment ("r/dc/48" = "/dhtml/column48/"). Major directories are one or two characters, and related entries start with the same letter when possible ("s" for services, "h" for headlines etc.). Internet.com links start with an "i", and other sites all have three-letter abbreviations for consistency.
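
To get a rough feel for how the savings add up, you can run the substitution in reverse and count characters. This is a minimal sketch, not the tool we used: ABBR is a hypothetical three-entry subset of the map, abbreviate and bytes_saved are names invented for illustration, and it assumes plain ASCII URLs so that one character equals one byte.

```python
# Hypothetical subset of the map; entries copied from the sample redirects.
ABBR = {
    "d": "dhtml/",
    "dc": "dhtml/column",
    "icp": "http://www.internet.com/corporate/privacy/privacypolicy.html",
}


def abbreviate(url, abbr=ABBR):
    """Return the shortest /r/ form whose map target is a prefix of url, else url unchanged."""
    best = url
    for key, target in abbr.items():
        if url.startswith(target):
            rest = url[len(target):]
            # Anything after the target is carried along after a slash.
            candidate = "/r/" + key + ("/" + rest if rest else "")
            if len(candidate) < len(best):
                best = candidate
    return best


def bytes_saved(urls, abbr=ABBR):
    """Total characters saved by abbreviating every URL in urls (ASCII assumed)."""
    return sum(len(u) - len(abbreviate(u, abbr)) for u in urls)


if __name__ == "__main__":
    links = [
        "http://www.internet.com/corporate/privacy/privacypolicy.html",
        "dhtml/column48",
        "about.html",
    ]
    for link in links:
        print(link, "->", abbreviate(link))
    print("saved:", bytes_saved(links))
```

A URL with no matching entry is left alone, so the count never goes negative; on a front page with a lot of links, the per-link savings are what add up.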

To add more abbrevs, just edit the rewritemap source file and run something like:

"create_dbm abbr_webref abbr_webref.txt"

in the "/www/misc/redir/" directory. Remember to use TABS to separate the two fields, and test before deploying. Yahoo uses redirects like this on their home page to great effect (load time is nearly instantaneous). This technique alone saves us 5-6K off the 24K hand-optimized home page.
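
Since a stray space instead of a tab is the easiest mistake to make, a quick check of the text file before rebuilding the map can save a broken deploy. The sketch below is one possible way to do it, assuming each line should hold exactly one tab between two non-empty fields; parse_map is a hypothetical helper, not part of create_dbm or Apache.

```python
def parse_map(text):
    """Parse 'key<TAB>target' lines; return (entries, problems).

    problems is a list of (line_number, message) tuples.
    """
    entries = {}
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2 or not fields[0] or not fields[1]:
            problems.append((lineno, "expected two tab-separated fields"))
            continue
        key, target = fields
        if "/" in key or " " in key:
            # The rule's [^/]* group can never capture a key containing a slash.
            problems.append((lineno, "key contains a slash or space"))
        elif key in entries:
            problems.append((lineno, "duplicate key " + key))
        else:
            entries[key] = target
    return entries, problems


if __name__ == "__main__":
    sample = "pg\tprogramming/\nth tools/html/\npg\tperl/\n"
    entries, problems = parse_map(sample)
    for lineno, message in problems:
        print("line", lineno, ":", message)
```

The first definition of a key wins here, so a duplicate is reported rather than silently overriding an existing redirect.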

Quotes

Quotes around some attribute values are optional in HTML 4.01, and can be safely omitted from your HTML. Note that XHTML requires all attribute values to be quoted and written out in full (checked="checked" etc.). In HTML 4.01, an attribute value must be quoted if it contains any character other than letters (A-Za-z), digits, hyphens, periods, underscores, and colons. So you can do this:

<IMG SRC="t.gif" WIDTH=1 HEIGHT=1>

but not this:

<TABLE WIDTH=100%>
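
If you strip quotes with a script, the rule can be captured in a one-line test. This is a sketch under one assumption: it uses the character set listed in the HTML 4.01 specification for unquoted values (letters, digits, hyphens, periods, underscores, and colons). The name can_omit_quotes is made up for illustration.

```python
import re

# Characters HTML 4.01 permits in an unquoted attribute value (assumed set, see above).
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def can_omit_quotes(value):
    """Return True if value may appear unquoted; empty values always need quotes."""
    return _UNQUOTED_OK.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ("1", "t.gif", "100%", "r/pg"):
        print(repr(value), can_omit_quotes(value))
```

The percent sign in "100%" and the slash in "r/pg" both fall outside the permitted set, which is why those values need their quotes to stay valid.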

Although it's not technically valid HTML, you can also omit the quotes from A HREFs like this:

<a href=r/pg></a>

The links still work on every browser we tried, and we actually got the front page down to the 13K range using this technique. However, XHTML requires all attribute values, URLs included, to be quoted, so we recently switched to quoting URLs to have a valid page (sans the ad code's ampersands).

http://www.internet.com
Created: Jan. 10, 2000
Revised: Mar. 19, 2001
URL: http://webreference.com/authoring/languages/html/optimize/part2.html