HTTP for HTML Authors, Part III - HTML with Style | WebReference


HTTP for HTML Authors, Part III

Zen and the art of URLs

When I first introduced URLs back in Tutorial 2, I mentioned that you should strive to make URLs as short and descriptive as possible. A good URL is short, to the point, doesn't contain too many non-alphanumeric characters, and is at least partially understandable by a human. To give you an example, compare these two URLs:

http://www.acme.com/cgi-bin/hermes.pl;amsesskey=88d89d87g6g89saedcd8csd89?refid=98140034556.html&action=display&category=PR
http://www.acme.com/news/pressreleases/1998/june/14/morons13release

You can imagine both being URLs that point to a press release from Acme Computer Corporation. However, the second one has some distinct advantages over the first one.

First of all, it's shorter. If I want to type it into an e-mail message and send it to a friend, or even scribble it on a napkin, I can. Also, even though the difference between the two URLs is only 56 bytes, that difference is paid every time the URL appears in an HTML page you serve and every time a visitor requests it. Multiply it by your traffic and it can become more than just a tiny blip on your bandwidth radar.
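As a rough back-of-the-envelope sketch, the saving can be estimated by counting the characters in the two example URLs and multiplying by traffic. The request count below is a made-up figure for illustration only, not something taken from a real server log.

```python
# The two example URLs from the comparison above.
LONG_URL = (
    "http://www.acme.com/cgi-bin/hermes.pl;amsesskey=88d89d87g6g89saedcd8csd89"
    "?refid=98140034556.html&action=display&category=PR"
)
SHORT_URL = "http://www.acme.com/news/pressreleases/1998/june/14/morons13release"

# Hypothetical traffic figure, assumed purely for illustration.
REQUESTS_PER_DAY = 1_000_000


def bytes_saved(long_url: str, short_url: str, occurrences: int) -> int:
    """Bytes saved when the short URL replaces the long one `occurrences` times."""
    # Both URLs are plain ASCII, so one character is one byte.
    per_use = len(long_url.encode("ascii")) - len(short_url.encode("ascii"))
    return per_use * occurrences


if __name__ == "__main__":
    saved = bytes_saved(LONG_URL, SHORT_URL, REQUESTS_PER_DAY)
    print(f"Roughly {saved / 1_000_000:.1f} MB saved per day")
```

The same function can be fed the number of times the URL appears in your HTML pages instead of the number of requests; the two costs simply add up.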

The second URL is also easier to understand for those bipedal mammals that might occasionally visit your site. It is relatively obvious that it refers to the press release issued on June 14, 1998 about the release of MORONS 1.3, in the news section of the Acme Web site. Some of this information is also present in the first URL, but it's not nearly as obvious. Readability has two advantages. First, the URL is easier to remember, so people can spread the word or come back to the page even if they haven't bookmarked it in their browser. Second, developers working on the site, as well as users transcribing the URL from that beer-stained napkin, can more easily spot errors and fix them.
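Because each path segment of the second URL means something, a few lines of code can take it apart as well. The following is a minimal sketch in Python; the function name `describe` and the six-segment layout are assumptions based on this one example URL, not a scheme any real site is known to use.

```python
from datetime import date
from urllib.parse import urlsplit

# Lowercase English month names, as used in the example URL.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


def describe(url: str) -> dict:
    """Split a /section/subsection/year/month/day/name URL into its parts.

    Raises ValueError or KeyError if the path does not follow that layout,
    for example when a month name is misspelled.
    """
    segments = urlsplit(url).path.strip("/").split("/")
    section, subsection, year, month, day, name = segments
    return {
        "section": section,
        "subsection": subsection,
        "date": date(int(year), MONTHS[month], int(day)),
        "name": name,
    }
```

Calling `describe` on the second example URL yields the news section, the pressreleases subsection, the date June 14, 1998 and the name morons13release. A mistyped segment such as `jnue` fails immediately, which is the machine equivalent of a human spotting the error on the napkin.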

Looking at the less human side of things again, the second URL doesn't have to change if the technologies you use on your site change. The first URL makes it blindingly obvious to anyone into that sort of thing that it is produced by a script written in the Perl programming language (notice the .pl suffix), which someone calls “Hermes”, running as a CGI program on the Web server (the /cgi-bin/ directory is a dead giveaway). If your company decides to drop Perl and CGI and opt for Java Servlets and JavaServer Pages instead, you would probably have to change the URL to match.
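One way to get this independence is to keep the public URL as a stable name and translate it inside the server to whatever the current backend expects. The sketch below assumes a hypothetical lookup table and reuses the parameter names from the first example URL; the record id is a placeholder, and a real site would more likely do this in its server configuration or application framework.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit

# Hypothetical table mapping public paths to internal record ids.
RECORD_IDS = {
    "/news/pressreleases/1998/june/14/morons13release": "98140034556",
}


def backend_target(public_url: str) -> Optional[str]:
    """Translate a public URL into the internal script request, if known."""
    path = urlsplit(public_url).path
    record_id = RECORD_IDS.get(path)
    if record_id is None:
        return None  # the server would answer 404 here
    query = urlencode({"refid": record_id, "action": "display", "category": "PR"})
    # Only this line changes if Perl and CGI are replaced by something else.
    return "/cgi-bin/hermes.pl?" + query
```

If the Perl script were later replaced, only the return line of `backend_target` would need to change; every published link would stay exactly as it is.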

Finally, and most importantly, the second URL is more or less guaranteed to keep pointing to the same resource for as long as that resource is available. There's room for more sections parallel to this one: the /news/pressreleases/ directory allows you to add /news/inthepress/ or /news/newsletter/ later on. The URL won't change if the ordering on the page changes, because the press release has a symbolic name, morons13release, instead of a number, so inserting something before it has no effect. And nothing in it depends on the implementation: there are no implementation-specific directory names, filename suffixes, or weird codes, keys and serial numbers. This means that when a user comes back to this URL after a very, very long time, they won't be faced with the dreaded “404 - Not Found” message and be unable to find the press release.


URL: http://www.webreference.com/html/tutorial30/1.html

Produced by Stephanos Piperoglou
Created: March 15, 2001
Revised: March 16, 2001