WebReference.com - Excerpt from Inside XSLT, Chapter 2, Part 1 (4/4)
In XSLT, each node has a set of properties associated with it. The following list shows the kinds of properties that the writers of XSLT processors keep track of for each node:
- name. The name of the node.
- string-value. The text of the node.
- base-URI. The node's base URI (the XML version of a URL).
- child. A list of child nodes; null if there are no children.
- parent. The node's parent node.
- has-attribute. Specifies an element node's attributes if it has any.
- has-namespace. Specifies an element node's namespace nodes.
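To make the list above concrete, here is a minimal sketch of how a processor might record these properties for each node. The `Node` class, its field names, and the `append_child` helper are hypothetical and exist only for illustration; no particular XSLT processor is implied to store nodes this way. One deliberate difference from the list: an empty list, rather than null, stands for "no children."

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node record mirroring the properties listed above."""
    name: str                                   # name
    string_value: str = ""                      # string-value
    base_uri: Optional[str] = None              # base-URI
    # parent is excluded from repr and comparison to avoid walking back up the tree
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)    # child
    attributes: List["Node"] = field(default_factory=list)  # has-attribute
    namespaces: List["Node"] = field(default_factory=list)  # has-namespace

    def append_child(self, child: "Node") -> None:
        """Attach a child node and set its parent link."""
        child.parent = self
        self.children.append(child)


# Build a two-node tree: a root element with one child element.
root = Node(name="planets", base_uri="http://example.com/planets.xml")
planet = Node(name="planet", string_value="Mercury")
root.append_child(planet)

print(planet.parent.name)
print([child.name for child in root.children])
```

Keeping both a parent link and a child list is what lets a processor walk the tree in either direction.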
There's another consideration to take into account when working with trees: XSLT processors are built on top of XML parsers, and the rules for XML parsers and XSLT processors are slightly different, which can lead to problems. This issue can become important in some cases, so the following section discusses it briefly.
The Information Set Model Versus the XSLT Tree Model
XML parsers pass on only certain information, as dictated by the core XML Information Set specification, which you can find at www.w3.org/TR/xml-infoset (see New Riders' Inside XML for more information on XML Information Sets). XSLT processors, by contrast, adhere to the XSLT tree model. The two models differ in what they consider important, which can lead to problems.
For example, two XML items are part of the core information set but are not available in XSLT: notations and skipped entity references (entity references that the XML parser has chosen not to expand). In practice, this means that even if the XML parser passes on information about these items, the XSLT processor can't do anything with it. However, notations are rarely used, and very few XML parsers generate skipped entity references, so this is not a significant problem.
On the other hand, XML parsers can strip comments out of XML documents, which is something you should know about, because the XSLT model is supposed to include them.
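Comment stripping is easy to observe outside XSLT. As a small illustration, the sketch below uses Python's standard `xml.etree.ElementTree` module, which is not an XSLT processor but which drops comments by default in the same way. The sample document and the `count_comments` helper are made up for this example, and the `insert_comments` option assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SOURCE = "<planets><!-- inner planets --><planet>Mercury</planet></planets>"


def count_comments(root):
    """Count the comment nodes anywhere in an ElementTree tree."""
    return sum(1 for node in root.iter() if node.tag is ET.Comment)


# Default parser: comments are discarded before the tree is built.
default_root = ET.fromstring(SOURCE)

# Parser explicitly configured to keep comments in the tree.
keeping_parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
kept_root = ET.fromstring(SOURCE, parser=keeping_parser)

print(count_comments(default_root))
print(count_comments(kept_root))
```

The first tree contains no comment nodes and the second contains one. Anything layered on top of the default parser never sees the comment at all.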
In addition, DTD information is not passed on from the XML parser to the XSLT processor (perhaps because W3C is planning more widespread use of XML schemas in XSLT 2.0, although there's still no official mechanism to connect XML schemas with XML documents). That's not usually a problem, because it's up to the XML parser to validate the XML document, except in one case: when an attribute is declared to be of type ID. In XML, you can declare an attribute with any name to be of type ID, so the XSLT processor has no idea which attributes are of this type unless it has access to the DTD. This is important when you're using stylesheets that are embedded in XML documents, because then the XSLT processor needs to know which element in the document holds the stylesheet you want to use to transform the document. In this case, some XSLT processors, like Saxon, go beyond the XSLT recommendation and scan the DTD, if there is one, to see which attributes are of type ID.
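The idea of scanning a DTD for ID attributes can be sketched with Python's standard `xml.parsers.expat` module, which reports each attribute declaration it reads. This only illustrates the general technique, not how Saxon or any other processor actually does it; the sample document, the `find_id_attributes` function, and the `on_attlist` callback are all hypothetical.

```python
import xml.parsers.expat

SOURCE = """<?xml version="1.0"?>
<!DOCTYPE catalog [
  <!ELEMENT catalog (item*)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item code ID #REQUIRED
                 label CDATA #IMPLIED>
]>
<catalog>
  <item code="a1" label="first">One</item>
  <item code="a2">Two</item>
</catalog>
"""


def find_id_attributes(xml_text):
    """Return the set of (element, attribute) pairs declared as type ID."""
    id_attributes = set()

    def on_attlist(element_name, attribute_name, attribute_type, default, required):
        # Called once for every attribute declared in an ATTLIST declaration.
        if attribute_type == "ID":
            id_attributes.add((element_name, attribute_name))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(xml_text, True)
    return id_attributes


id_attributes = find_id_attributes(SOURCE)
print(sorted(id_attributes))
```

Here only the `code` attribute of `item` is reported, because `label` is declared as CDATA. Nothing in the attribute names themselves reveals which one is the ID, which is why the DTD has to be consulted.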
There are a few more items that you also might want to know about. For example, the XSLT processing model makes namespace prefixes available in the input tree, but it gives you very little control over them in the output tree, where they are handled automatically. Also, the XSLT processing model defines a base URI for every node in a tree, which is the URI of the external entity from which the node was derived. (In the XSLT 1.1 working draft, that's been extended to support the XML Base specification, as you'll see near the end of this chapter.) However, in the XML information set, base URIs are considered peripheral, which means that the XML parser may not pass that information on to the XSLT processor.
All in all, you should know that XSLT processors use XML parsers to read XML documents, and that the junction between those packages is not a seamless one. If you find you're missing some necessary information in an XSLT transformation, that's something to bear in mind. In fact, the differences between the XML infoset and the XSLT tree model are one of the areas that XSLT 2.0 is supposed to address. Among other things, XSLT 2.0 is supposed to make it easier to recover ID and key information from the source document, as well as to recover information from the source document's XML declaration, such as the XML version and encoding.
Created: September 12, 2001
Revised: September 12, 2001