HTML for the Lazy: Omitting Implied Tags | WebReference

So far, I've talked about what elements are and how an HTML document is made up of them. As you might remember from our first tutorial, all documents have an HTML element, which always contains a HEAD and a BODY element. You probably think this is relatively wasteful, and if you do, you're absolutely right. Since the HTML element is always there, you don't have to include its start-tag and end-tag in your document. All user agents reading your document will expect it to be there. And they'll assume it's there even if you don't include the tags.

So, you can omit the tags of a lot of elements that can be implied. The important thing to remember here, however, is that the elements are still there. The reason you can leave their tags out of the document source is that the elements are always present, so you don't have to bother typing the tags in.

Let me explain this a bit further. When a user agent reads an HTML document, it creates a tree of elements. At the top of the tree is the HTML element. This branches out to two sub-trees, the HEAD and BODY elements. Take, for example, Acme's first page from the first tutorial. Here it is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<HTML>
 <HEAD>
  <TITLE>Acme Computer Corp.: Who We Are</TITLE>
 </HEAD>
 <BODY>
<H1>Acme Computer Corp.</H1>
<P>Acme Computer Corporation is a technology-based
company that seeks to offer its customers the 
latest in technological innovation. Our products
are created using the latest breakthroughs in
computers and are designed by a team of top-notch
experts.</P>
<P>We are based in Acmetown, USA, and have offices in 
most major cities around the world. Our goal is to
have a global approach to the future of computing.
Have a look at our product catalog for some 
examples of our innovative approach.</P>
 </BODY>
</HTML>

This document can be considered as a tree of elements and text that looks like this:

[Figure: Tree of HTML Elements]

Now the top part of this tree (the HTML, HEAD and BODY elements) is always present in every document. So we can just omit the tags of these elements if we don't want to bother. This is what results:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<TITLE>Acme Computer Corp.: Who We Are</TITLE>
<H1>Acme Computer Corp.</H1>
<P>Acme Computer Corporation is a technology-based
company that seeks to offer its customers the 
latest in technological innovation. Our products
are created using the latest breakthroughs in
computers and are designed by a team of top-notch
experts.</P>
<P>We are based in Acmetown, USA, and have offices in 
most major cities around the world. Our goal is to
have a global approach to the future of computing.
Have a look at our product catalog for some 
examples of our innovative approach.</P>

Now let's see who's been paying attention, class: what is the tree of elements that corresponds to the above document? Is it a much simpler tree? No. In fact, it is the exact same tree. Although the start- and end-tags of the HTML, HEAD and BODY elements have been omitted, the elements are still there. Be careful not to confuse tags with elements. A tag is inserted into a document to mark the beginning or end of an element, but it can be omitted when the presence of the element is implicit from the context. The two documents above are not, in fact, two separate documents, but are one and the same. I just left out the obvious bits the second time around. Because I'm lazy.

Since we're in a lazy mood, let's ask some questions that must spring to mind: Why type in these tags in the first place? In fact, since you can omit the tags, why have the elements?

The answer to the first question is rather simple. Firstly, we might want to give these elements some attributes. In this case, we have to include at least the start-tag. Secondly, it usually helps to include the tags just as a reminder that the elements are there. In the end, it boils down to whether or not you're willing to believe in the existence of something you can't see with your own eyes. I'm not, so I make a habit of including implied tags. But in the end it's really a matter of faith.
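As a rough illustration of the first reason, here is a sketch of the Acme page with attributes on the HTML and BODY elements. The LANG value and the CLASS name "about" are my own assumptions for the sake of the example, not part of the original page. Because attributes live in start-tags, those two start-tags have to stay, while the HEAD tags and the optional end-tags can still be left out:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<!-- The HTML start-tag is written out only because it carries LANG (assumed value) -->
<HTML LANG="en">
<!-- No HEAD tags: TITLE implies that the HEAD element has started -->
<TITLE>Acme Computer Corp.: Who We Are</TITLE>
<!-- The BODY start-tag carries CLASS (hypothetical name) and implies the end of HEAD -->
<BODY CLASS="about">
<H1>Acme Computer Corp.</H1>
<P>We are based in Acmetown, USA, and have offices in
most major cities around the world.</P>
<!-- The BODY and HTML end-tags are omitted; the end of the document implies them -->
```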

The answer to the second question should also be obvious if you've been getting into the spirit of things. Elements are there to help organize the document in a logical fashion. The HTML element is the all-encompassing element, the HEAD element contains information about the document, and the BODY element contains the document itself. This is a useful distinction and makes things easier to organize. It would be a lot messier if, for instance, you had a load of LINK elements sprinkled around your document. By organizing them and putting them all in the HEAD element, the document makes more sense.
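To make that concrete, here is a small sketch of a HEAD that gathers the document's LINK elements in one place. The file names acme.css and catalog.html are hypothetical; only the arrangement matters:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<HTML>
 <HEAD>
  <!-- Information about the document lives here -->
  <TITLE>Acme Computer Corp.: Who We Are</TITLE>
  <LINK REL="stylesheet" TYPE="text/css" HREF="acme.css">
  <LINK REL="next" HREF="catalog.html">
 </HEAD>
 <BODY>
  <!-- The document itself lives here -->
  <H1>Acme Computer Corp.</H1>
  <P>Have a look at our product catalog for some
  examples of our innovative approach.</P>
 </BODY>
</HTML>
```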

The element descriptions you'll see in the tutorials from now on will list whether you can omit the start-tag and / or end-tag of the element. This depends entirely on the element.

Those of you who, like me, are truly into laziness might have noticed that "and / or" bit in the previous paragraph. Yes, it's true, you can omit even more tags.

The placement of elements in a document is strictly defined in HTML. We already mentioned that the LINK and TITLE elements, for instance, can only appear within a HEAD element.

Sometimes the start-tag of one element appears at a point where it makes it implicit that the element before it has ended. In this case the earlier element's end-tag can be left out, provided its definition says the end-tag is optional. This sounds confusing, but look at our example. There are two P elements there. In HTML, you cannot have a paragraph element within another paragraph element, so when the second paragraph starts, the first paragraph must end. The definition of the P element states that the end-tag is optional, so we can omit it in this case. This is what results:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<TITLE>Acme Computer Corp.: Who We Are</TITLE>
<H1>Acme Computer Corp.</H1>
<P>Acme Computer Corporation is a technology-based
company that seeks to offer its customers the 
latest in technological innovation. Our products
are created using the latest breakthroughs in
computers and are designed by a team of top-notch
experts.
<P>We are based in Acmetown, USA, and have offices in 
most major cities around the world. Our goal is to
have a global approach to the future of computing.
Have a look at our product catalog for some 
examples of our innovative approach.

Notice that I have also omitted the end-tag of the second paragraph, because it too can be implied. When the document ends, the BODY element ends, and the P element inside it must end too, so its end-tag is implied.

Note that the document is still the same. We have only omitted a few tags because they can be implied, but the elements are still the same. The first <P> tag denotes the beginning of a P element. The second <P> tag denotes the beginning of a new P element, which means that the previous element must end here.

Omitting an element's end-tag like this is very different from having empty elements, which we have discussed previously. LINK is an example of an empty element. LINK elements do not have end-tags because they have no content: they begin and end in the same place. The P elements above, on the other hand, do not have end-tags because those tags can be implied from the context. The two paragraphs above are not empty elements. They are proper elements that contain the entire paragraph text; I have just left out their end-tags because I'm lazy.
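The difference is easier to see side by side. In this sketch (the stylesheet name acme.css is an assumption of mine), LINK has no end-tag because it is empty, while each P has content and merely has its end-tag implied:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<TITLE>Acme Computer Corp.: Who We Are</TITLE>
<!-- Empty element: no content, so there is no end-tag to write -->
<LINK REL="stylesheet" TYPE="text/css" HREF="acme.css">
<H1>Acme Computer Corp.</H1>
<!-- Not empty: this P contains text; its end-tag is implied by the next P start-tag -->
<P>Acme Computer Corporation is a technology-based company.
<!-- Not empty either: this P's end-tag is implied by the end of the document -->
<P>We are based in Acmetown, USA.
```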

To summarise:

  1. If an element's definition states that its start-tag and end-tag are optional, then they can be omitted if the element's presence can be implied by the context.
  2. If an element's definition states that its end-tag is optional, then its end-tag can be omitted if it is followed by an element that it is not allowed to contain, or if the element that contains it ends.
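Rule 2 applies to more than paragraphs. As a sketch, assuming the P and LI elements are defined with optional end-tags (as they are in the HTML 4.0 definition) and using made-up product names, the document below drops every optional end-tag:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
 "http://www.w3.org/TR/REC-html40/strict.dtd">
<TITLE>Acme Computer Corp.: Products</TITLE>
<H1>Our Products</H1>
<!-- P cannot contain UL, so the UL start-tag implies the end of this P -->
<P>Here are some examples of our innovative approach:
<UL>
 <!-- LI cannot directly contain another LI, so each new LI ends the previous one -->
 <LI>Acme Desktop
 <LI>Acme Laptop
 <!-- The UL end-tag closes the list, so it also implies the end of the last LI -->
 <LI>Acme Server
</UL>
```

Note that the UL end-tag stays in place: the definition of UL requires both of its tags, so only the P and LI end-tags can be dropped here.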

I would recommend that you always include start- and end-tags in your document even if they can be omitted. The first reason is that it illustrates the structure of the document much better, and you're just being lazy if you don't include them. The second and more important reason is that some user agents (most notably Netscape Navigator and Microsoft Internet Explorer, on all versions and all platforms they are available on) process elements erroneously if their tags are omitted. I will give specific examples of this in the future, but for the moment the best tactic is to include all tags, even when they can be implied. I spent so much time discussing omitted tags because, since some user agents process them incorrectly, you might be led to believe that leaving out tags is equivalent to leaving out the elements. This is not so.

But since we mentioned that some elements are allowed within others while others are not, let's see what types of elements we can have in HTML.

Produced by Stephanos Piperoglou

URL: http://www.webreference.com/html/tutorial3/1.html
Created: June 25, 1998
Revised: June 25, 1998