Charting the XML territory with XMLMap (1/2) - exploring XML
Charting the XML territory with XMLMap
After three years of exploring the XML jungle, the time has come to add another type of map to this area. While the Column Trailmap gives an overview of the columns written for the XML section of WebReference, we will, over the next couple of installments, create a map of XML standards.
It is clearly impossible to cover every single XML vocabulary ever invented, but a walkthrough organized by topic should include the most relevant standards. Furthermore, exhaustive lists such as the Cover Pages are available elsewhere. We will organize the XMLMap into four groups:
- Fundamentals: The basic XML infrastructure underlying all vocabularies.
- Documents: General-purpose XML document standards.
- Messages: Vocabularies used in communications between systems.
- Communities: Formats sponsored by special interest groups.
In loose order, we will expand the map into these different areas and explore various aspects of them. Today, we'll start with the fundamentals of XML. Early on, the W3C attempted to lay the groundwork for XML with maximum compatibility and interoperability in mind, with mixed results, as we shall see.
Underlying everything is the specification of XML itself, with its definitions of well-formed and valid XML data, as well as its clarification of character set issues. The initial 1.0 specification can be considered a success, as a 1.1 specification has been released only recently, elaborating on some further character set issues that had appeared. No substantial changes have been made. They would be virtually impossible to make anyhow, as everything based on core XML would be affected.
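To make the well-formed half of that distinction concrete, here is a minimal sketch in Python. It assumes the standard library's `xml.etree.ElementTree` parser, which checks well-formedness only; validity against a DTD or schema would need a validating parser and is not shown. The helper name `is_well_formed` and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<note><to>Reader</to></note>"
bad = "<note><to>Reader</note>"  # <to> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

A document that passes this check may still be invalid, since validity additionally requires that it follows the declarations of its DTD or schema.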
The definition of XML Namespaces was the first challenge to the W3C and its standards creation process. Although the need to mix different XML vocabularies in one document was detected early, the fix was not released until much later, in the form of XML Namespaces. While the solution is obvious, namely to prefix tags with an acronym that gets mapped to a globally unique ID, it had widespread effects that rippled through the other fundamental standards. While some, such as the XML linking effort, used namespaces to their advantage, others still have not coped, like XML Schemas, where the effect of mixing different declarations is still unclear, at least to me.
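The prefix-to-ID mapping can be sketched with Python's standard `xml.etree.ElementTree`. The two namespace URIs, the `inv` and `bill` prefixes and the sample document are assumptions invented for this example; only the mechanism is the point.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both define an <item> element, mixed in one document.
# The URIs are hypothetical; they only need to be globally unique.
doc = """<order xmlns:inv="http://example.com/ns/inventory"
       xmlns:bill="http://example.com/ns/billing">
  <inv:item>Map</inv:item>
  <bill:item>9.95</bill:item>
</order>"""

root = ET.fromstring(doc)

# The parser replaces each prefix with the URI it is mapped to.
tags = [child.tag for child in root]
print(tags)

# The prefix is only a local shorthand: a query may use a different one,
# as long as it maps to the same URI.
found = root.find("i:item", {"i": "http://example.com/ns/inventory"})
print(found.text)
```

The two `item` elements stay distinguishable because their expanded names differ, even though their local names are identical.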
With the roots of XML in HTML, the next areas of XML infrastructure were fairly clear:
- Linking of documents
- Styling of documents
- Definitions of XML Schemas and their validation
Linking of documents
Since one of the most important features of HTML is hypertext, i.e. the linking between disparate texts, the W3C was quick to establish XLink as an effort to mimic these capabilities and fix the worst problem of the Web, the broken link. XLink came up with various types of links, the most prevalent being the href link, which inherits the broken link problem. An alternative that fixes the problem is the external link, which resides in a separate document, external to both the source and the target document. While this is a neat idea, it creates so many implementation and management problems that I have yet to see it implemented, especially on a system lacking central control such as the World-Wide Web.
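As a rough illustration of the href style of link, the sketch below collects XLink simple links from a small document using Python's standard `xml.etree.ElementTree`. The `column` and `see` element names, the target URL and the `simple_links` helper are hypothetical; the XLink namespace URI and the `xlink:type` and `xlink:href` attributes come from the XLink specification.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical document: one element carries a simple XLink, one does not.
doc = """<column xmlns:xlink="http://www.w3.org/1999/xlink">
  <see xlink:type="simple" xlink:href="http://example.com/trailmap.xml">Column Trailmap</see>
  <see>no link here</see>
</column>"""


def simple_links(root):
    """Return (text, href) pairs for every element marked as a simple XLink."""
    links = []
    for el in root.iter():
        if el.get(f"{{{XLINK_NS}}}type") == "simple":
            href = el.get(f"{{{XLINK_NS}}}href")
            if href is not None:
                links.append((el.text, href))
    return links


links = simple_links(ET.fromstring(doc))
print(links)
```

Because the target address is stored inside the source document, nothing notices when the target moves, which is exactly the broken link problem carried over from HTML.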
With foresight that was amazingly absent in the styling department (next page), a separate effort was created for the more complex referencing of parts of an XML document, by the name of XPointer. Whereas XLink only supports the XML name attributes, XPointer allows referencing by name, position and tag type. Here too, a level of complexity was created that has hindered widespread implementation to date. Some observers proclaimed a clash between the camps of high-priced niche SGML and mass-market mainstream HTML.
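Referencing by position can be sketched as follows. This assumes the child-sequence form of XPointer's `element()` scheme, in which `element(/1/2)` means the second child element of the document element; it covers none of the other XPointer forms. Python's standard library does not, as far as I know, ship an XPointer processor, so `resolve_child_sequence` is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def resolve_child_sequence(root, pointer):
    """Resolve a pointer such as 'element(/1/2)' (hypothetical helper).

    Only the pure child-sequence form is handled. Returns None when a
    step points past the available child elements.
    """
    prefix = "element(/"
    if not (pointer.startswith(prefix) and pointer.endswith(")")):
        raise ValueError("only element(/n/m/...) pointers are handled")
    steps = [int(s) for s in pointer[len(prefix):-1].split("/")]
    # The first step selects the document element itself, so it must be 1.
    if steps[0] != 1:
        return None
    node = root
    for step in steps[1:]:
        children = list(node)  # child elements only, in document order
        if step < 1 or step > len(children):
            return None
        node = children[step - 1]  # steps are 1-based
    return node


doc = "<list><item>first</item><item>second</item><item>third</item></list>"
root = ET.fromstring(doc)

target = resolve_child_sequence(root, "element(/1/2)")
print(target.text)
```

Even this narrow slice needs rules for counting and error handling, which hints at why complete implementations have been slow to appear.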
XPath, yet another way to specify document parts, or more precisely nodes, grew out of XSL's need (see styling department, next page) to identify the targets of XML document manipulation. While CSS is limited to referencing a node's class, XPath allows XSL to specify things like odd vs. even rows in a list. XPath processors are implemented as part of XSLT processors.
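The odd vs. even selection might look like the sketch below. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and, to my knowledge, not the `position() mod 2` test, so the full XPath expressions appear as comments and the split itself is emulated with list slicing. The `list` and `item` element names are made up for the example.

```python
import xml.etree.ElementTree as ET

doc = ("<list><item>a</item><item>b</item><item>c</item>"
       "<item>d</item><item>e</item></list>")
root = ET.fromstring(doc)

# A full XPath 1.0 processor would express the two groups as:
#   /list/item[position() mod 2 = 1]   (odd rows)
#   /list/item[position() mod 2 = 0]   (even rows)
# Emulated here, since ElementTree implements only part of XPath.
items = root.findall("./item")
odd_rows = [el.text for el in items[0::2]]   # positions 1, 3, 5
even_rows = [el.text for el in items[1::2]]  # positions 2, 4

# ElementTree does understand simple positional predicates.
second = root.find("./item[2]").text

print(odd_rows, even_rows, second)
```

In an XSLT stylesheet the same predicates would typically sit in the `match` or `select` attribute of a template or loop, which is why XPath ships inside XSLT processors.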
Next are styles and schemas...
Produced by Michael Claßen
Created: Dec 09, 2002
Revised: Dec 09, 2002