XML Parser Comparison (1/2) - exploring XML | WebReference

XML Parser Comparison

To process XML data, every program or server process needs an XML parser. The parser extracts the actual data from its textual representation and either raises events or builds new data structures from it. Parsers also check whether documents conform to the XML standard and have a correct structure, which is essential for processing XML documents automatically.
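The two parser styles mentioned above, event-based and structure-building, can be sketched with Python's standard library. This is only an illustration: the sample document, the `ItemHandler` class, and both helper functions are hypothetical names made up for this example, and they are not taken from any of the parsers compared in this column.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this sketch.
DOC = "<order><item>pen</item><item>ink</item></order>"


class ItemHandler(xml.sax.ContentHandler):
    """Collects the text of every <item> element as parser events arrive."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False


def items_via_events(text):
    """Event style: the parser calls back into the handler while reading."""
    handler = ItemHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.items


def items_via_tree(text):
    """Tree style: the parser builds a data structure that is queried afterwards."""
    root = ET.fromstring(text)
    return [item.text for item in root.iter("item")]
```

Both functions return the same list for the sample document. The event style never holds the whole document in memory, while the tree style is usually more convenient when the data has to be navigated repeatedly.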

Correct structure comes in two flavors: XML-conforming and DTD/Schema-conforming. XML-conforming, so-called "well-formed" documents fulfill the XML standard, most importantly by having properly nested tags. Document structure can be defined in the traditional way using Document Type Definitions (DTDs) or with the new XML Schema proposal, as outlined in my previous column. A so-called "validating" parser checks a document not only for conformance to the general XML rules but also against a particular DTD or Schema, verifying that all necessary elements are present and that they appear in the specified order. Not all parsers on the market are validating, although most of the more recent implementations are. Support for XML Schema is naturally less prevalent, as the specification is not yet a final recommendation and some software manufacturers have been burnt by early implementations.
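As a minimal sketch of the well-formedness check described above, the following uses Python's standard `xml.etree.ElementTree` module. The helper name `is_well_formed` is hypothetical. Note that this standard-library parser is non-validating: it reports improperly nested tags but does not enforce a DTD or Schema, which requires a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised for violations of the XML rules, such as badly nested tags.
        return False
    return True


properly_nested = "<a><b/></a>"
badly_nested = "<a><b></a></b>"
```

Here `is_well_formed(properly_nested)` succeeds, while `is_well_formed(badly_nested)` fails because the closing tags are in the wrong order.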

With the growing number of XML vocabularies, tag names in different DTDs are more and more likely to clash, so a facility for combining multiple vocabularies in one XML document was needed. XML Namespaces is such a facility: it associates each tag name with a namespace, identified by a URI and usually standing for a particular vocabulary, so that elements from several vocabularies can be mixed in one document. Ambiguities are eliminated by distinguishing between, say, <person:name> and <product:name>. This also eases the reuse of DTDs.
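The <person:name> versus <product:name> distinction can be illustrated with a namespace-aware parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the namespace URIs, the sample document, and the variable names are all assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies; the URIs are placeholders.
DOC = """<order xmlns:person="http://example.com/person"
       xmlns:product="http://example.com/product">
  <person:name>Ada</person:name>
  <product:name>Notebook</product:name>
</order>"""

# Prefix-to-URI mapping used for lookups; the prefixes here need not match
# the ones in the document, only the URIs matter.
NS = {
    "person": "http://example.com/person",
    "product": "http://example.com/product",
}

root = ET.fromstring(DOC)
person_name = root.find("person:name", NS)
product_name = root.find("product:name", NS)

# Internally each tag is qualified by its namespace URI, which is what
# keeps the two "name" elements apart.
person_tag = person_name.tag
product_tag = product_name.tag
```

After parsing, the two elements share the local name `name` but carry different qualified tags, so a program can treat them as unrelated.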

XML, as opposed to HTML, does not contain any information about the layout of its data. Layout can be defined separately, for instance with stylesheets written in Cascading Style Sheets (CSS) or the Extensible Stylesheet Language (XSL). This is a favorable trait of XML, as it allows the same information to be adapted to different browsers or clients using different transformations. The transformation can be achieved using XSLT; for details see column 17. Not all XML parsers have integrated support for XSLT, though.

Let's look at the candidates.


Produced by Michael Claßen

URL: http://www.webreference.com/xml/column22/index.html
Created: Oct 20, 2000
Revised: Oct 20, 2000