The Art & Science of Web Design | 8 | WebReference

Let's start with HTML as our basis for discussing structure. We've already seen where it came from: humble beginnings in early database systems and its evolution through SGML. And we've seen why its goals of simplicity and forgiveness made it so rapidly popular. But how can something so pervasive come from something so simple?

The answer lies in the basic building block of the Web: text. As far back as you look in the history of the Web, plain old text has been the lingua franca. I'm referring to the simple .txt files on your computer, like the READMEs that come with new software (which is also, as a matter of fact, the format of the HTML files we use to build our Web sites). But now, with all our modern applications and emphasis on graphics and visuals, isn't text outdated? For example:

So why the emphasis on text? Again, there are a few reasons:

Thus, the fact that HTML is derived from plain text means that it inherits all the computer-enabled benefits of ASCII. Computers can manipulate the text. We can create programs to do all sorts of wonderful things to our content: We can index it and search it, we can translate it into other languages, and we can copy and paste it. The possibilities are, quite literally, endless.
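To make the indexing and searching claim a little more concrete, here is a minimal sketch in Python. It is not from the book; the file names, the sample sentences, and the `build_index` and `search` helpers are all made up for illustration. The point is only that plain text gives a program something to work with.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical plain-text documents, invented for this example.
documents = {
    "story.txt": "The story was about Microsoft and Bill Gates.",
    "readme.txt": "Read this file before installing the software.",
}
index = build_index(documents)
```

Calling `search(index, "bill gates")` finds `story.txt`, because both words are sitting there as text. Had the same sentence been saved as a picture, there would be nothing for `build_index` to read.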

None of these things are possible when you leave text behind. In traditional print design, for example, it is not uncommon to take text from a layout program like QuarkXPress and drop it into a graphics application like Photoshop. By turning the text into a graphic, designers can manipulate it all they want to achieve the desired effect. They can stretch and rotate and embellish until a headline or drop cap is perfect, and then import it back into their documents. But what if we do this on the Web? The words in the headline, as a graphic, lose their meaning. The computer can no longer distinguish them as words; it sees only a graphic. The machine-readable benefits of text are gone.

With a foundation of plain text, HTML takes it a step further into structured text. If machine readability is an admirable goal, then structure applied to simple text is the proverbial Holy Grail. Think about it: If a computer can process a file, adding structure by means of tags can provide clues to what that text actually means. For example, take the following bit of text:

The story was about Microsoft and Bill Gates.

What can a computer do with the line above? Well, as we've seen, it can perform any number of transformations. The line can be spell-checked, searched, translated, converted to capital letters, or printed in green. But consider the following:

<p>The story was about 
<company website=""
symbol="MSFT">Microsoft</company> and 
<person title="President" 
employer="Microsoft">Bill Gates</person>.</p>

Now consider how easy it would be to programmatically manipulate the text. Not only can I do all the things we could do to the previous example, but I can add even more value. I can look up the current stock price of the company mentioned. I can build a link to the company's home page on the Web. I can link to any biographical information I may have on Mr. Gates. I can search this text, and any other text we have, and aggregate all the officers of public companies. And the list goes on and on.
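As a rough sketch of what "programmatically manipulate" could look like, the Python below reads the tagged sentence with the standard library's `xml.etree.ElementTree` and pulls out the companies and people it mentions. The `website` value is a placeholder I've assumed for the example, and the `extract_metadata` helper is hypothetical, not something the book defines.

```python
import xml.etree.ElementTree as ET

# The tagged sentence from the text. The website URL is a placeholder
# assumed for this sketch, not a value taken from the original.
SNIPPET = (
    '<p>The story was about '
    '<company website="http://www.example.com/" symbol="MSFT">Microsoft</company> and '
    '<person title="President" employer="Microsoft">Bill Gates</person>.</p>'
)


def extract_metadata(markup):
    """Return (companies, officers, plain_text) found in a tagged fragment."""
    root = ET.fromstring(markup)
    companies = [
        {"name": el.text, "symbol": el.get("symbol"), "website": el.get("website")}
        for el in root.iter("company")
    ]
    officers = [
        {"name": el.text, "title": el.get("title"), "employer": el.get("employer")}
        for el in root.iter("person")
    ]
    # The readable sentence is still there once the tags are ignored.
    plain_text = "".join(root.itertext())
    return companies, officers, plain_text


companies, officers, plain_text = extract_metadata(SNIPPET)
```

From here, the `symbol` could feed a stock lookup and the `website` could become a link, while `plain_text` is still the original sentence, ready for everything we could do with it before.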

We've just added a very powerful feature to our text: something called metadata, or information about information. The metadata in the tags is not intended to be displayed as part of the sentence but rather as embellishment and annotation of the sentence. It is adding value. It is allowing us to reference parts of our content.

These are structural tags. They talk about the semantics of a document and add metadata so that we can manipulate our content. Others, purely presentational tags, offer none of these benefits. Think for a second about the difference between these two examples:

The story was about <b>Microsoft</b> and Bill Gates.


The story was about <company>Microsoft</company> and Bill Gates.

Which is more valuable? Obviously, the second allows us far more opportunity to disambiguate the content. The <b> tag may render the company's name in boldface type, but it tells us nothing about the content. The <company> tag, on the other hand, gives us a clear idea of what is being referenced, but says nothing about how our browser should display the word. Wouldn't it be great if we could get the best of both worlds, adding rich metadata while maintaining control of the visual presentation?

Luckily, that is exactly how HTML was designed.
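One way to picture the "best of both worlds" idea is to keep the structural tag in the source and decide how it looks only at display time. The Python sketch below is just an illustration under that assumption: the `PRESENTATION` mapping and the `render` helper are hypothetical, and this is not a description of how any browser actually works.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from structural tag names to presentational tags.
PRESENTATION = {"company": "b", "person": "i"}


def render(markup, presentation=PRESENTATION):
    """Return a display copy with structural tags swapped for presentational ones."""
    root = ET.fromstring(markup)
    for el in root.iter():
        if el.tag in presentation:
            el.tag = presentation[el.tag]
            el.attrib.clear()  # metadata stays in the source, not in the display copy
    return ET.tostring(root, encoding="unicode")


source = "<p>The story was about <company>Microsoft</company> and Bill Gates.</p>"
rendered = render(source)
```

The `source` string still carries the `<company>` tag for any program that wants the meaning, while `rendered` is what gets shown. Changing the look is a matter of editing the mapping, not the content.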

Translating the Web with Babelfish

It can be tempting to bypass the limitations of HTML for the visually stunning impact of graphics. By imprisoning parts of your pages as graphics, you can achieve a variety of effects beyond the rather rudimentary capabilities of today's browsers. Headlines can come alive in any typeface you desire. Text can rotate and show off drop shadows, and on and on and on.

But is it really such a good idea?

For a perfectly clear example of the power of text, we can turn to the Alta Vista Search Engine. One of the interesting features the service offers is the capability to translate Web pages into other languages. Thus, if you find an interesting-looking page written in Spanish (and you don't happen to habla Español), you can let the Babelfish translator convert it to English.


That is, if the page is actually still text. The engine can't get to the words found in graphics, so all those fancy headlines are going to stay elusive. Bummer, considering that's often the most important content on the page. And those sites that create their content as a graphic or Flash animation? Well, you're completely out of luck.

The Alta Vista translation service, Babelfish, will convert Web pages between a number of different languages... if it can read them.
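The limitation is easy to see from a program's point of view. The sketch below uses Python's standard `html.parser` to collect the text a tool could hand to a translator. It says nothing about how Babelfish itself is built; the two sample pages, the `headline.gif` file name, and the `visible_text` helper are assumptions made up for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the non-empty text chunks found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the list of text chunks a program can read from the page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Two hypothetical versions of the same page: one with a text headline,
# one with the headline baked into an image.
text_page = "<h1>Noticias de hoy</h1><p>Hola, mundo.</p>"
graphic_page = '<img src="headline.gif"><p>Hola, mundo.</p>'
```

For `text_page`, `visible_text` returns both the headline and the paragraph. For `graphic_page`, only the paragraph comes back: the headline's words live inside the image, so there is nothing to translate.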

Created: April 5, 2001
Revised: April 5, 2001