Navigating Input Documents Using Paths
If you want to learn how to extract information from XML documents, you'll want to read this three-part series. It covers path expressions. This article is excerpted from chapter four of the book XQuery, written by Priscilla Walmsley (O'Reilly, 2007; ISBN: 0596006349). Copyright © 2007 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.
Path expressions are used to navigate input documents to select elements and attributes of interest. This chapter explains how to use path expressions to select elements and attributes from an input document and apply predicates to filter those results. It also covers the different methods of accessing input documents.
Path Expressions
A path expression is made up of one or more steps separated by a slash (/) or a double slash (//). For example, the path:
doc("catalog.xml")/catalog/product
selects all the product children of the catalog element in the catalog.xml document. Table 4-1 shows some other simple path expressions.
Table 4-1. Simple path expressions
Example | Explanation |
doc("catalog.xml")/catalog | The catalog element that is the outermost element of the document |
doc("catalog.xml")//product | All product elements anywhere in the document |
doc("catalog.xml")//product/@dept | All dept attributes of product elements in the document |
doc("catalog.xml")/catalog/* | All child elements of the catalog element |
doc("catalog.xml")/catalog/*/number | All number elements that are grandchildren of the catalog element |
Path expressions return nodes in document order. This means that each example in Table 4-1 returns its nodes in the same order in which they appear in the catalog.xml document; the product elements, for instance, come back in their original sequence. More information on document order and on sorting results differently can be found in Chapter 7.
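To make the paths in Table 4-1 concrete, the sketch below evaluates rough equivalents in Python with the standard-library xml.etree.ElementTree module. It is only an approximation: ElementTree implements a limited subset of XPath rather than XQuery, its paths start from the root catalog element instead of a document node, and the catalog content is a hypothetical stand-in for catalog.xml, which this excerpt does not reproduce.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for catalog.xml (assumed content, not from the excerpt).
CATALOG_XML = """
<catalog>
  <product dept="WMN"><number>557</number><name>Fleece Pullover</name></product>
  <product dept="ACC"><number>563</number><name>Floppy Sun Hat</name></product>
  <product dept="MEN"><number>784</number><name>Cotton Dress Shirt</name></product>
</catalog>
"""

# fromstring returns the root element (catalog), not a document node.
catalog = ET.fromstring(CATALOG_XML)

# doc("catalog.xml")/catalog/product: product children of catalog
products = catalog.findall("product")

# doc("catalog.xml")//product: product elements anywhere in the document
all_products = catalog.findall(".//product")

# doc("catalog.xml")//product/@dept: ElementTree cannot select attribute nodes,
# so the attribute values are read from each matched element instead.
depts = [p.get("dept") for p in all_products]

# doc("catalog.xml")/catalog/*: all child elements of catalog
children = catalog.findall("*")

# doc("catalog.xml")/catalog/*/number: number grandchildren of catalog
numbers = [n.text for n in catalog.findall("*/number")]

print(depts)
print(numbers)
```

Both printed lists follow the order of the product elements in the sample document, which mirrors the document-order behavior described above.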
Path Expressions and Context
A path expression is always evaluated relative to a particular context item, which serves as the starting point for the relative path. Some path expressions start with a step that sets the context item, as in:
doc("catalog.xml")/catalog/product/number
The function call doc("catalog.xml") returns the document node of the catalog.xml document, which becomes the context item. When the context item is a node (as opposed to an atomic value), it is called the context node. The rest of the path is evaluated relative to it. Another example is:
$catalog/product/number
where the value of the variable $catalog sets the context. The variable must select zero, one or more nodes, which become the context nodes for the rest of the expression.
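The way each step hands a new set of context nodes to the next one can be sketched in Python with the standard-library xml.dom.minidom module, which, like doc(), yields a document node. The child_step helper and the inline catalog are hypothetical and exist only for illustration; a real XQuery processor also handles other axes, duplicate removal, and ordering.

```python
from xml.dom import Node, minidom

# Hypothetical stand-in for catalog.xml (assumed content).
CATALOG_XML = (
    '<catalog>'
    '<product dept="WMN"><number>557</number></product>'
    '<product dept="ACC"><number>563</number></product>'
    '</catalog>'
)

def child_step(context_nodes, name):
    """Illustrative helper: apply one child step to every context node."""
    selected = []
    for node in context_nodes:
        for child in node.childNodes:
            if child.nodeType == Node.ELEMENT_NODE and child.tagName == name:
                selected.append(child)
    return selected

# Like doc("catalog.xml"): the starting context is the document node.
document = minidom.parseString(CATALOG_XML)
context = [document]

# /catalog/product/number: each step's result is the context for the next.
for step in ("catalog", "product", "number"):
    context = child_step(context, step)

numbers = [n.firstChild.data for n in context]

# Like $catalog/product/number, assuming $catalog is bound to the catalog element.
catalog = document.documentElement
from_variable = child_step(child_step([catalog], "product"), "number")

print(numbers)
```

Starting from the document node or from a variable bound to the catalog element reaches the same number elements here; only the starting context differs.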
A path expression can also be relative. For example, it can simply start with a name, as in:
product/number
This means that the path expression will be evaluated relative to the current context node, which must have been previously determined outside the expression. It may have been set by the processor outside the scope of the query, or in an outer expression.
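Because a relative path depends entirely on the context node it is given, the same path can select different nodes, or none at all. The following Python sketch illustrates this with xml.etree.ElementTree, whose findall method evaluates a path relative to the element it is called on. The sample catalog is again an assumed stand-in for catalog.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for catalog.xml (assumed content).
catalog = ET.fromstring(
    '<catalog>'
    '<product dept="WMN"><number>557</number></product>'
    '<product dept="ACC"><number>563</number></product>'
    '</catalog>'
)

RELATIVE_PATH = "product/number"

# With the catalog element as the context node, the path finds both numbers.
from_catalog = [n.text for n in catalog.findall(RELATIVE_PATH)]

# With a product element as the context node, the same path selects nothing,
# because a product has no product children.
first_product = catalog.find("product")
from_product = [n.text for n in first_product.findall(RELATIVE_PATH)]

print(from_catalog)
print(from_product)
```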