|
From SGML's point of view, a document is a hierarchical structure of
nested elements (chapters, sections, paragraphs, and so on).
SGML has no means of specifying any presentational aspects of these
elements, and was never intended to have any. However, strictly
speaking, SGML cannot tell you about the meaning or role of any
element, either. This information is implied by the creator of an SGML
application and is usually provided in comments or in the
documentation accompanying the formal specification.
SGML realizes the maxim of Wittgenstein, who said, "The meaning of a
word is its use." In SGML, the only information that can be formally
communicated about an element is in what contexts and levels of
document hierarchy it can or must occur. This means that you cannot
build an interpreter that could apply meaningful formatting to a
document based only on its SGML markup. However, the purely formal
dissection that SGML performs on a document is still surprisingly
useful in many situations.
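To make the idea of purely formal constraints more concrete, here is a minimal sketch in Python. It is not how an SGML parser works, and the element names and the ALLOWED_CHILDREN table are hypothetical. It models only which elements may occur inside which others, ignoring order, repetition, and required elements, all of which a real DTD can also express.

```python
# Hypothetical, heavily simplified "document type": for each element,
# the set of elements that may appear directly inside it.
ALLOWED_CHILDREN = {
    "chapter": {"section"},
    "section": {"section", "paragraph"},
    "paragraph": set(),
}


def find_violations(element, path=()):
    """Return messages for every element found in a disallowed context.

    An element is modeled as a (name, children) pair, where children is a
    list of further (name, children) pairs.
    """
    name, children = element
    here = path + (name,)
    if name not in ALLOWED_CHILDREN:
        return ["/".join(here) + ": unknown element"]
    violations = []
    for child in children:
        child_name = child[0]
        if child_name not in ALLOWED_CHILDREN[name]:
            violations.append("/".join(here + (child_name,)) + ": not allowed here")
        else:
            # The child is allowed in this context, so check its own contents.
            violations.extend(find_violations(child, here))
    return violations


good = ("chapter", [("section", [("paragraph", [])])])
bad = ("chapter", [("paragraph", [])])

print(find_violations(good))
print(find_violations(bad))
```

Note that the check says nothing about what a chapter or a paragraph is or how it should look. It can only report whether each element sits in a permitted place in the hierarchy.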
All documents that can be marked up with the same hierarchy of
elements are said to belong to a certain document type.
Rather than describe a set of tools to mark up documents, SGML
defines the structure of a particular type of document via what is
called a document type definition (DTD). Part of this chapter is devoted to analyzing the
DTD for one particular SGML application, HTML version 4.0
(code-named Cougar). Besides (and before) the DTD, some
general features of an SGML application are specified in another
formal construct called the SGML declaration, which is detailed in
the next section.
As for SGML syntax, suffice it to say for now that it is pretty
close to the syntax of HTML. You will see that SGML statements, like
HTML tags, are enclosed in angle brackets (<>) and
contain a keyword or name followed by one or more parameters separated
by spaces. The only consistent difference is that SGML statements
commonly have the ! character inserted between the open
delimiter < and the statement keyword, for example:
<!ELEMENT IMG - O EMPTY -- Embedded image -->
You are probably already familiar with one type of statement that uses the
<! syntax, namely comments in HTML documents that are
enclosed in <!-- and -->. That's because
the comment syntax of HTML is directly borrowed from SGML, where
everything within a <! statement enclosed in double
hyphens (--) is ignored by the SGML parser. For example,
the words Embedded image in the preceding code line are
intended as a comment for human readers only.
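The comment rule can be illustrated with a short Python sketch. The parse_declaration function is a hypothetical helper, not part of any SGML toolkit. It makes a simplifying assumption: every pair of double hyphens inside the statement delimits a comment. A real parser must also handle quoted literals and other cases that this sketch ignores.

```python
import re


def parse_declaration(text):
    """Split a <! ... > statement into (keyword, parameters, comments)."""
    text = text.strip()
    if not (text.startswith("<!") and text.endswith(">")):
        raise ValueError("not a <! ... > statement")
    body = text[2:-1]
    # Collect the text enclosed in each pair of double hyphens.
    comments = [c.strip() for c in re.findall(r"--(.*?)--", body, flags=re.S)]
    # Remove the comments; what remains is the keyword and its parameters.
    significant = re.sub(r"--.*?--", " ", body, flags=re.S)
    tokens = significant.split()
    keyword = tokens[0] if tokens else ""
    return keyword, tokens[1:], comments


print(parse_declaration("<!ELEMENT IMG - O EMPTY -- Embedded image -->"))
print(parse_declaration("<!-- just a comment -->"))
```

Under this simplified view, an HTML comment is simply a <! statement that contains a comment and nothing else. That is why the second call yields an empty keyword and no parameters.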
One more <!-type declaration that needs to appear
in HTML files is DOCTYPE, discussed briefly later in this chapter.
|