|
As with every new and promising technology, it is probably more
important to explain what it isn't than what it is.
XML, just like SGML, is not a page layout or graphics
language. By itself, XML provides even fewer presentation
tools than you have with HTML. Strictly speaking, it's not
even a markup language, but rather a system making it possible to
build such languages to match any conceivable document type.
Chapter 3, "SGML and the HTML DTD,"
explains the HTML document type definition (DTD), the specification
of HTML tags and document structure written in SGML.
Similarly, with XML you can build a DTD that exactly matches the
structure of your document and introduces a set of self-explanatory,
logically organized tags and attributes fine-tuned for your markup
needs.
By attaching the DTD to the document sent over the network, you can
ensure that the XML software reading the document can parse it
correctly and thereby guarantee its correct formatting, conversion,
insertion into a database, or whatever else the receiver chooses to do with
the document. In short, with XML you can create your own HTML, or
XYZML, or Whatever-You-Like-ML! (No surprise such a language was
called "extensible" in the first place.)
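
To make the idea concrete, here is a minimal sketch in Python. The "memo" document type, its element names, and its priority attribute are all invented for this illustration, and the markup follows the syntax of XML 1.0 as it was eventually standardized, which may differ in detail from the draft discussed in this chapter. The DTD travels with the document in its DOCTYPE declaration; note that the parser in Python's standard library reads the declarations but does not validate the document against them.

```python
import xml.etree.ElementTree as ET

# Hypothetical "MemoML": a tiny document type invented for this sketch.
# The DTD is attached to the document inside the DOCTYPE declaration.
MEMO = """<!DOCTYPE memo [
  <!ELEMENT memo (to, from, body)>
  <!ATTLIST memo priority (low | normal | high) "normal">
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<memo priority="high">
  <to>Editorial staff</to>
  <from>Production</from>
  <body>Proofs are due on Friday.</body>
</memo>"""

# ElementTree is a non-validating parser: it accepts the DTD above
# but only checks that the document itself is syntactically sound.
root = ET.fromstring(MEMO)

# Collect the text of each child element, keyed by its tag name.
fields = {child.tag: child.text for child in root}

print(root.tag, root.get("priority"))
for name, text in fields.items():
    print(f"{name}: {text}")
```

The tag names say what each piece of data is rather than how it should look, which is the kind of self-explanatory markup the text describes.
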
It is important to understand that XML isn't better than HTML because
it makes it easier to change fonts or position images. In the visual
presentation realm, XML is no better than HTML (some might say
it's worse because it lacks all those neat Netscape
enhancements---unless you've defined them in your DTD). It was
the intention of the creators of the language that the visual
presentation of an XML document can be (optionally) specified by an
attached style sheet, which is an external mechanism for XML
just as it is for HTML.
XML's visualization power is thus completely determined by the style
sheet language you use---for example, Cascading Style Sheets (CSS) or
Document Style Semantics and Specification Language (DSSSL)---and if
you don't care about logical markup you can achieve exactly the same
visible results by using this chosen style sheet mechanism with
HTML. (Remember that you can use the neutral SPAN tag in
HTML to apply any attributes, style names included, to arbitrary
fragments of text.) It is when the proper internal structure of your
data really matters that XML easily outshines HTML.
The XML specification
defines the language in terms of the behavior of a parser,
which is a piece of software whose sole purpose is to understand the
element structure of your document and break it down into nested
elements in accordance with the DTD. Another program (termed
simply "application" in the XML spec) is supposed to obtain
the document thus dissected from the parser and process it
further. Exactly what the application does with the document
is outside the scope of XML; for instance, it may be a browser that
displays the document using an appropriate style sheet.
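
The division of labor between parser and application can be sketched roughly as follows. This is an illustration only: the function names `parse` and `outline` and the sample recipe document are assumptions made for the example, and the standard-library parser used here does not consult a DTD. The parser's job ends once it has produced a tree of nested elements; the "application" here merely prints an indented outline of that tree, but it could just as well be a browser or a database loader.

```python
import xml.etree.ElementTree as ET


def parse(text):
    """Parser role: turn markup into a tree of nested elements."""
    return ET.fromstring(text)


def outline(element, depth=0):
    """Application role (hypothetical): list element names by nesting depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Sample document invented for this sketch.
DOC = (
    "<recipe>"
    "<title>Tea</title>"
    "<steps><step>Boil water</step><step>Steep</step></steps>"
    "</recipe>"
)

tree = parse(DOC)       # the parser dissects the document
report = outline(tree)  # the application processes the result
print("\n".join(report))
```
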
Because XML is a subset of SGML, an XML document is almost always a valid
SGML document; there are small discrepancies between these two
languages that are likely to be eliminated soon with the acceptance of
certain amendments to the SGML standard. The relation between XML and HTML
is more complex. Because XML lets you define new tags, XML documents
are not likely to count as valid HTML very often; on the other hand,
an HTML file is relatively easy to make XML-conformant on one of the
two levels of conformance (described later), depending
on whether you provide a DTD for your document or not.
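
As a rough illustration of the lower of those two levels, the one that needs no DTD and that the XML specification calls well-formedness, the sketch below feeds two fragments to Python's standard-library XML parser. The helper name `is_well_formed` and both sample strings are assumptions made for the example. The HTML-style fragment leaves its tags open, which an XML parser rejects; closing every element (and writing the empty line-break element as `<br/>`) is enough to make it acceptable.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Typical HTML habits: <br> and <p> are never closed.
HTML_STYLE = "<p>First line<br>second line"

# The same content with every element explicitly closed.
XML_STYLE = "<p>First line<br/>second line</p>"

print(is_well_formed(HTML_STYLE))
print(is_well_formed(XML_STYLE))
```

Checking the document against a DTD, the higher level of conformance, requires a validating parser, which this sketch does not attempt.
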
I don't attempt a real tutorial on XML in this chapter for two
reasons. First, one chapter's space is surely insufficient to cover
even the basics of the language, and second, the language itself is so
young and unstable that it is probably premature to start teaching it
in a serious fashion. (A quote from the language specification:
"Please be advised that the draft you are now reading is unusually
volatile.") Instead, this chapter presents a couple of small examples
that will help you to quickly grasp the "look and feel" of the
language.
In a sense, XML is positioned somewhere in between SGML and HTML, with
the intent of its creators being to combine the best features of these
two languages. However, XML is much closer to SGML than to HTML, and
although knowledge of HTML will help you understand the most obvious
XML features, an acquaintance with SGML syntax and ideology would be
of much greater help. So I recommend that you brush up on what you
remember from reading Chapter 3 before
proceeding to subsequent sections of this chapter.
|