21.4 Querying XML with XPath
XPath is a simple language that refers to elements, attributes, and text within an XML document. An XPath expression can refer to an XML element by its position in the document hierarchy or can select an element based on the value of (or simple presence of) an attribute. A full discussion of XPath is beyond the scope of this chapter, but Section 21.4.1 presents a simple XPath tutorial that explains common XPath expressions by example.
The W3C has drafted an API for selecting nodes in a DOM document tree using an XPath expression. Firefox and related browsers implement this W3C API with the evaluate() method of the Document object (for both HTML and XML documents). Mozilla-based browsers also implement Document.createExpression(), which compiles an XPath expression so that it can be evaluated efficiently multiple times.
IE provides XPath expression evaluation with the selectSingleNode() and selectNodes() methods of XML (but not HTML) Document and Element objects. Later in this section, you'll find example code that uses both the W3C and IE APIs.
If you wish to use XPath with other browsers, consider the open-source AJAXSLT project.
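As a rough illustration of how the W3C and IE APIs described above might be hidden behind one function, here is a minimal sketch. The name xpathSelect() and its parameters are hypothetical. The sketch assumes that the context argument is either a Document or a node with an ownerDocument property, and it uses the numeric value 7 for XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Hypothetical helper: return an array of the nodes that match `xpath`,
// evaluated relative to `context` (a Document or an Element).
function xpathSelect(context, xpath, resolver) {
    // A Document has no ownerDocument, so fall back to the context itself.
    var doc = context.ownerDocument || context;
    var nodes = [];

    if (typeof doc.evaluate === "function") {
        // W3C API. 7 is XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
        var result = doc.evaluate(xpath, context, resolver || null, 7, null);
        for (var i = 0; i < result.snapshotLength; i++) {
            nodes.push(result.snapshotItem(i));
        }
    } else if (typeof context.selectNodes !== "undefined") {
        // IE API (XML documents only). selectNodes() returns a node list.
        var list = context.selectNodes(xpath);
        for (var j = 0; j < list.length; j++) {
            nodes.push(list.item(j));
        }
    } else {
        throw new Error("XPath is not supported by this document");
    }
    return nodes;
}
```

Returning a plain array in both branches means that calling code does not need to know which API produced the nodes.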
21.4.1 XPath Examples
If you understand the tree structure of a DOM document, it is easy to learn simple XPath expressions by example. In order to understand these examples, though, you must know that an XPath expression is evaluated in relation to some context node within the document. The simplest XPath expressions simply refer to children of the context node:
The "path" in the name XPath refers to the fact that the language treats levels in the XML element hierarchy like directories in a filesystem and uses the "/" character to separate levels of the hierarchy. Thus:
Note that contact/email[2] evaluates to the set of <email> elements that are the second <email> child of any <contact> child of the context node. This is not the same as contact[1]/email[2] or (contact/email)[2].
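A positional predicate applies to the step it follows, which is why contact/email[2], contact[1]/email[2], and (contact/email)[2] can select different nodes. The sketch below does not use an XPath engine. It models the <contact> children of a context node as plain JavaScript arrays (hypothetical data) and spells out by hand what each expression would select.

```javascript
// Hypothetical data: each inner array holds the <email> children of one
// <contact> child of the context node, in document order.
var contacts = [
    ["a1@example.com"],                     // first <contact>
    ["b1@example.com", "b2@example.com"],   // second <contact>
    ["c1@example.com", "c2@example.com"]    // third <contact>
];

// contact/email[2]: the second <email> of every <contact> that has one.
function secondEmailOfEach(contacts) {
    var out = [];
    for (var i = 0; i < contacts.length; i++) {
        if (contacts[i].length >= 2) out.push(contacts[i][1]);
    }
    return out;
}

// contact[1]/email[2]: the second <email> of the first <contact> only.
function secondEmailOfFirst(contacts) {
    if (contacts.length > 0 && contacts[0].length >= 2) return [contacts[0][1]];
    return [];
}

// (contact/email)[2]: gather every <email> first, then take the second.
function secondOfAllEmails(contacts) {
    var all = [];
    for (var i = 0; i < contacts.length; i++) all = all.concat(contacts[i]);
    return all.length >= 2 ? [all[1]] : [];
}
```

With this data, secondEmailOfEach(contacts) yields b2@example.com and c2@example.com, secondEmailOfFirst(contacts) yields nothing because the first contact has only one address, and secondOfAllEmails(contacts) yields b1@example.com.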
A dot (.) in an XPath expression refers to the context element, and a double slash (//) elides levels of the hierarchy, referring to any descendant instead of an immediate child. For example:
XPath expressions can refer to XML attributes as well as elements. The @ character is used as a prefix to identify an attribute name:
The value of an XML attribute can filter the set of elements returned by an XPath expression. For example:
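When the attribute value comes from a program variable, the filter expression has to be assembled as a string. The following sketch uses a hypothetical helper, attributeFilter(), which is not part of any XPath API. It assumes XPath 1.0 string literals, which have no escape sequences, so it picks whichever quote character the value does not contain and gives up if the value contains both.

```javascript
// Hypothetical helper: build an expression that selects the elements named
// by `path` that have attribute `name`, optionally with a given `value`.
function attributeFilter(path, name, value) {
    // With no value, test only for the presence of the attribute.
    if (value === undefined) return path + "[@" + name + "]";

    value = String(value);
    var quote;
    if (value.indexOf("'") === -1) quote = "'";
    else if (value.indexOf('"') === -1) quote = '"';
    else throw new Error("value contains both kinds of quotes");

    return path + "[@" + name + "=" + quote + value + quote + "]";
}
```

For example, attributeFilter("contact", "personal", "true") returns contact[@personal='true'], and attributeFilter("contact", "name") returns contact[@name].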
To select the textual content of XML elements, use text() as the final step of the path, as in contact/email/text().
XPath is namespace-aware, and you can include namespace prefixes in your expressions:
//xsl:template // Select all <xsl:template> elements
When you evaluate an XPath expression that uses namespaces, you must, of course, provide a mapping of namespace prefixes to namespace URLs.
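How that mapping is supplied depends on the API. As a sketch, assume that the W3C evaluate() method accepts a resolver function that takes a prefix and returns a namespace URL (or null), and that IE's XML documents read the mapping from their SelectionNamespaces property. Under those assumptions, both forms can be generated from one table. The helper names makeResolver() and selectionNamespaces() are hypothetical.

```javascript
// One table of prefix-to-URL mappings, shared by both APIs.
var namespaces = { xsl: "http://www.w3.org/1999/XSL/Transform" };

// W3C style: return a function that maps a prefix to its namespace URL,
// or to null if the prefix is unknown.
function makeResolver(map) {
    return function (prefix) {
        return Object.prototype.hasOwnProperty.call(map, prefix) ? map[prefix] : null;
    };
}

// IE style: return a string of space-separated xmlns declarations.
function selectionNamespaces(map) {
    var decls = [];
    for (var prefix in map) {
        if (Object.prototype.hasOwnProperty.call(map, prefix)) {
            decls.push("xmlns:" + prefix + "='" + map[prefix] + "'");
        }
    }
    return decls.join(" ");
}
```

The resolver would be passed as the third argument to evaluate(). In IE, the string would be set with doc.setProperty("SelectionNamespaces", selectionNamespaces(namespaces)) before calling selectNodes().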
These examples are just a survey of common XPath usage patterns. XPath has other syntax and features not described here. One example is the count() function, which returns the number of nodes in a set rather than returning the set itself:
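For instance, count(//email) evaluates to the number of <email> elements in the document. With the W3C API, a numeric result like this is requested with a different result type than a node set. The following is a minimal sketch, assuming a document that implements evaluate(). The helper countNodes() is hypothetical, and the value 1 stands for XPathResult.NUMBER_TYPE.

```javascript
// Hypothetical helper: return the number of nodes that `xpath` selects
// when evaluated against the document `doc`.
function countNodes(doc, xpath) {
    // Wrap the expression in count() and ask for a numeric result.
    // 1 is XPathResult.NUMBER_TYPE.
    var result = doc.evaluate("count(" + xpath + ")", doc, null, 1, null);
    return result.numberValue;
}
```

Letting the XPath engine do the counting avoids building a node set in JavaScript only to read its length.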