Manipulating XML Data with JavaScript - 21.4 Querying XML with XPath
XPath is a simple language that refers to elements, attributes, and text within an XML document. An XPath expression can refer to an XML element by its position in the document hierarchy or can select an element based on the value of (or simple presence of) an attribute. A full discussion of XPath is beyond the scope of this chapter, but Section 21.4.1 presents a simple XPath tutorial that explains common XPath expressions by example.
The W3C has drafted an API for selecting nodes in a DOM document tree using an XPath expression. Firefox and related browsers implement this W3C API using the evaluate() method of the Document object (for both HTML and XML documents). Mozilla-based browsers also implement Document.createExpression(), which compiles an XPath expression so that it can be efficiently evaluated multiple times.
IE provides XPath expression evaluation with the selectSingleNode() and selectNodes() methods of XML (but not HTML) Document and Element objects. Later in this section, you’ll find example code that uses both the W3C and IE APIs.
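As a rough illustration of how the two APIs differ, the sketch below hides both behind a single function. The name selectXPathNodes is hypothetical, the literal 7 stands in for the constant XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, and the IE branch assumes that the node list returned by selectNodes() exposes length and item(). Treat it as a starting point under those assumptions rather than a tested cross-browser utility.

```javascript
// Hypothetical helper: return an array of the nodes that match an XPath
// expression, evaluated relative to `context` (a Document or an Element).
function selectXPathNodes(context, expr) {
    // A Document has nodeType 9; any other node knows its ownerDocument.
    var doc = context.nodeType === 9 ? context : context.ownerDocument;
    var nodes = [];

    if (typeof doc.evaluate === "function") {
        // W3C API. 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
        var result = doc.evaluate(expr, context, null, 7, null);
        for (var i = 0; i < result.snapshotLength; i++) {
            nodes.push(result.snapshotItem(i));
        }
    } else if (typeof context.selectNodes !== "undefined") {
        // IE API: available on XML (not HTML) documents and elements.
        var list = context.selectNodes(expr);
        for (var j = 0; j < list.length; j++) {
            nodes.push(list.item(j));
        }
    } else {
        throw new Error("XPath is not supported for this node");
    }
    return nodes;
}
```

Returning a plain array means the calling code does not need to know which of the two APIs produced the result.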
If you wish to use XPath with other browsers, consider the open-source AJAXSLT project at http://goog-ajaxslt.sourceforge.net.
21.4.1 XPath Examples
If you understand the tree structure of a DOM document, it is easy to learn simple XPath expressions by example. In order to understand these examples, though, you must know that an XPath expression is evaluated in relation to some context node within the document. The simplest XPath expressions simply refer to children of the context node:
contact // The set of all <contact> children of the context node
contact[1] // The first <contact> child of the context node
contact[last()] // The last <contact> child of the context node
contact[last()-1] // The penultimate <contact> child of the context node
Note that XPath array syntax uses 1-based arrays instead of JavaScript-style 0-based arrays.
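To make the off-by-one difference concrete, the following sketch models the <contact> children of a context node as a plain JavaScript array and maps XPath positions onto array indexes. The helper name nth and the sample values are invented for illustration. Real XPath evaluation works on DOM nodes, not arrays.

```javascript
// Stand-ins for three <contact> children of the context node.
var contacts = ["alice", "bob", "carol"];

// Hypothetical helper: convert a 1-based XPath position to a 0-based index.
function nth(nodes, position) {
    return nodes[position - 1];
}

var first = nth(contacts, 1);                           // like contact[1]
var last = nth(contacts, contacts.length);              // like contact[last()]
var penultimate = nth(contacts, contacts.length - 1);   // like contact[last()-1]
```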
The “path” in the name XPath refers to the fact that the language treats levels in the XML element hierarchy like directories in a filesystem and uses the “/” character to separate levels of the hierarchy. Thus:
contact/email // All <email> children of <contact> children of context
/contacts // The <contacts> child of the document root (leading /)
contact[1]/email // The <email> children of the first <contact> child
contact/email[2] // The 2nd <email> child of any <contact> child of context
Note that contact/email[2] evaluates to the set of <email> elements that are the second <email> child of any <contact> child of the context node. This is not the same as contact[2]/email or (contact/email)[2].
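The difference among these three expressions is easier to see with concrete data. The sketch below models each <contact> as a plain object with an emails array (an invented structure, not a DOM) and computes what each expression would select under that model.

```javascript
// Each object models one <contact> child of the context node;
// `emails` models its <email> children in document order.
var contacts = [
    { emails: ["a1"] },
    { emails: ["b1", "b2"] },
    { emails: ["c1", "c2", "c3"] }
];

// contact/email[2]: the second <email> of every <contact> that has one.
var secondOfEach = contacts
    .filter(function (c) { return c.emails.length >= 2; })
    .map(function (c) { return c.emails[1]; });

// contact[2]/email: every <email> of the second <contact>.
var emailsOfSecond = contacts[1].emails;

// (contact/email)[2]: the second <email> in the combined set.
var allEmails = contacts.reduce(function (acc, c) {
    return acc.concat(c.emails);
}, []);
var secondOverall = allEmails[1];

console.log(secondOfEach);    // [ 'b2', 'c2' ]
console.log(emailsOfSecond);  // [ 'b1', 'b2' ]
console.log(secondOverall);   // b1
```

The predicate in contact/email[2] applies separately within each <contact>, while the parentheses in (contact/email)[2] apply it once to the whole selected set.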
A dot (.) in an XPath expression refers to the context node, and a double slash (//) elides levels of the hierarchy, referring to any descendant instead of an immediate child. For example:
.//email // All <email> descendants of the context
//email // All <email> tags in the document (note leading slash)
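The distinction between a child step and a descendant step can be sketched with a small tree of plain objects. The node shape ({ name, id, children }) and the helper names childrenNamed and descendantsNamed are invented for this illustration. A real implementation would walk DOM nodes instead.

```javascript
// A toy tree: one <contact> directly under the root, one nested in a <group>.
var root = {
    name: "contacts",
    children: [
        { name: "contact", children: [{ name: "email", id: "e1" }] },
        { name: "group", children: [
            { name: "contact", children: [{ name: "email", id: "e2" }] }
        ] }
    ]
};

// Child step: only immediate children with the given name.
function childrenNamed(node, name) {
    return (node.children || []).filter(function (c) {
        return c.name === name;
    });
}

// Descendant step: matching nodes at any depth, in document order.
function descendantsNamed(node, name) {
    var found = [];
    (node.children || []).forEach(function (child) {
        if (child.name === name) found.push(child);
        found = found.concat(descendantsNamed(child, name));
    });
    return found;
}

// Models contact/email with `root` as the context node.
var viaChildren = [];
childrenNamed(root, "contact").forEach(function (c) {
    viaChildren = viaChildren.concat(childrenNamed(c, "email"));
});

// Models .//email with `root` as the context node.
var viaDescendants = descendantsNamed(root, "email");
```

In this model viaChildren holds only the e1 node, while viaDescendants holds both e1 and e2, because the second <email> sits one level deeper than the child steps can reach.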
XPath expressions can refer to XML attributes as well as elements. The @ character is used as a prefix to identify an attribute name:
@id // The value of the id attribute of the context node
contact/@name // The values of the name attributes of <contact> children
The value of an XML attribute can filter the set of elements returned by an XPath expression. For example:
contact[@personal="true"] // All <contact> tags with attribute personal="true"
To select the textual content of XML elements, use the text() node test:
contact/email/text() // The text nodes within <email> tags
//text() // All text nodes in the document
XPath is namespace-aware, and you can include namespace prefixes in your expressions:
//xsl:template // Select all <xsl:template> elements
When you evaluate an XPath expression that uses namespaces, you must, of course, provide a mapping of namespace prefixes to namespace URIs.
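With the W3C API, that mapping is supplied as a namespace resolver, which can be a function that takes a prefix and returns the corresponding URI. The sketch below shows one way to write such a function. The names namespaces and resolver are arbitrary, and the commented-out call assumes a browser that implements document.evaluate().

```javascript
// Prefix-to-URI table. List only the prefixes your expressions use.
var namespaces = {
    xsl: "http://www.w3.org/1999/XSL/Transform"
};

// A resolver function: given a prefix, return its namespace URI,
// or null if the prefix is unknown.
function resolver(prefix) {
    return Object.prototype.hasOwnProperty.call(namespaces, prefix)
        ? namespaces[prefix]
        : null;
}

// In a browser that supports the W3C API, the resolver is passed as the
// third argument (7 is XPathResult.ORDERED_NODE_SNAPSHOT_TYPE):
//   document.evaluate("//xsl:template", document, resolver, 7, null);
```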
These examples are just a survey of common XPath usage patterns. XPath has other syntax and features not described here. One example is the count() function, which returns the number of nodes in a set rather than returning the set itself:
count(//email) // The number of <email> elements in the document
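Because count() produces a number rather than a node set, the W3C API returns it through a different result type. The sketch below is one possible wrapper. The function name countNodes is hypothetical, the literal 1 stands in for XPathResult.NUMBER_TYPE, and the doc argument is assumed to be a Document that implements evaluate().

```javascript
// Hypothetical helper: count the nodes selected by an XPath expression.
function countNodes(doc, expr) {
    // 1 is the value of XPathResult.NUMBER_TYPE.
    var result = doc.evaluate("count(" + expr + ")", doc, null, 1, null);
    return result.numberValue;
}

// Usage in a browser that supports the W3C API:
//   var n = countNodes(document, "//email");
```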
This article is excerpted from chapter 21 of JavaScript: The Definitive Guide, Fifth Edition, written by David Flanagan (O'Reilly; ISBN: 0596101996).