21.2 Manipulating XML with the DOM API
The previous section showed a number of ways to obtain parsed XML data in the form of a Document object. The Document object is defined by the W3C DOM API and is much like the HTMLDocument object referred to by the document property of the browser’s Window object.
The following subsections explain some important differences between the HTML DOM and the XML DOM and then demonstrate how you can use the DOM API to extract data from an XML document and display that data to a user by dynamically creating nodes in the browser’s HTML document.
21.2.1 XML Versus HTML DOM
Probably the most important difference between the HTML and XML DOMs is that the getElementById() method is not typically useful with XML documents. In DOM Level 1, the method is actually HTML-specific, defined only by the HTMLDocument interface. In DOM Level 2, the method is moved up a level to the Document interface, but there is a catch. In XML documents, getElementById() searches for elements with the specified value of an attribute whose type is ID. It is not sufficient to define an attribute named “id” on an element: the name of the attribute does not matter; only its declared type does. Attribute types are declared in the DTD of a document, and a document’s DTD is specified in the DOCTYPE declaration. XML documents used by web applications often have no DOCTYPE declaration specifying a DTD, and a call to getElementById() on such a document always returns null. Note, however, that the getElementsByTagName() method of the Document and Element interfaces works fine for XML documents. (Later in the chapter, I’ll show you how to query an XML document using powerful XPath expressions; XPath can be used to retrieve elements based on the value of any attribute.)
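As a rough sketch of how to work around this without a DTD, the code below looks up an element by the value of an arbitrary attribute, using only getElementsByTagName() and getAttribute(). The helper name findByAttribute() and the sample <contacts> document are hypothetical, invented for illustration; the usage portion assumes a browser that provides DOMParser and is skipped elsewhere.

```javascript
// Hypothetical helper: return the first element whose attribute `name`
// has the given value, or null if there is none. It does not depend on
// the attribute being declared with type ID in a DTD.
function findByAttribute(doc, name, value) {
    const elements = doc.getElementsByTagName("*");  // "*" matches every element
    for (let i = 0; i < elements.length; i++) {
        if (elements[i].getAttribute(name) === value) return elements[i];
    }
    return null;
}

// Browser-only usage (assumes DOMParser is available).
if (typeof DOMParser !== "undefined") {
    const xml = '<contacts><contact id="c1"><name>Ada</name></contact></contacts>';
    const doc = new DOMParser().parseFromString(xml, "application/xml");
    const contact = findByAttribute(doc, "id", "c1");
    console.log(contact ? contact.tagName : "not found");
}
```

This is a linear scan of the whole document, so it suits small documents; for anything larger, the XPath techniques covered later in the chapter are likely a better fit.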
Another difference between HTML and XML Document objects is that HTML documents have a body property that refers to the <body> element within the document. For XML documents, only the documentElement property refers to the top-level element of the document. Note that this top-level element is also available through the childNodes property of the document, but it may not be the first or only node in that array-like list, because an XML document may also contain a DOCTYPE declaration, comments, and processing instructions at the top level.
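To make the point about childNodes concrete, here is a minimal sketch that walks a document's top-level nodes and returns the first one that is an element. The function name topLevelElement() and the sample markup are assumptions made for this example; in practice, documentElement gives the same result directly.

```javascript
// Hypothetical helper: scan the document's top-level nodes and return
// the first element, skipping comments, processing instructions, and
// any DOCTYPE node that precede it.
function topLevelElement(doc) {
    const ELEMENT_NODE = 1;  // value of Node.ELEMENT_NODE in the DOM specification
    const kids = doc.childNodes;
    for (let i = 0; i < kids.length; i++) {
        if (kids[i].nodeType === ELEMENT_NODE) return kids[i];
    }
    return null;
}

// Browser-only usage (assumes DOMParser is available).
if (typeof DOMParser !== "undefined") {
    const xml = '<!-- generated file --><?render mode="fast"?><root/>';
    const doc = new DOMParser().parseFromString(xml, "application/xml");
    // The root element is not the first child, but both lookups agree.
    console.log(topLevelElement(doc) === doc.documentElement);
}
```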
There is also an important difference between the XML Element interface and the HTMLElement interface that extends it. In the HTML DOM, standard HTML attributes of an element are made available as properties of the HTMLElement interface. The src attribute of an <img> element, for example, is available through the src property of the HTMLImageElement object that represents it. This is not the case in the XML DOM: the Element interface defines only a single property of its own, tagName. The attributes of an XML element must be explicitly queried and set with getAttribute(), setAttribute(), and related methods.
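The difference can be sketched as follows. The helper readAttributes() and the <photo> element are hypothetical names chosen for illustration, and the usage portion assumes a browser with DOMParser; it simply contrasts property access with getAttribute() and setAttribute() on an XML element.

```javascript
// Hypothetical helper: collect the named attributes of an XML element
// into a plain object, using getAttribute() for each one.
function readAttributes(element, names) {
    const result = {};
    for (let i = 0; i < names.length; i++) {
        // Current browsers typically return null for a missing attribute.
        result[names[i]] = element.getAttribute(names[i]);
    }
    return result;
}

// Browser-only usage (assumes DOMParser is available).
if (typeof DOMParser !== "undefined") {
    const doc = new DOMParser().parseFromString(
        '<photo src="cat.png" width="100"/>', "application/xml");
    const photo = doc.documentElement;
    console.log(photo.src);                  // no attribute-backed property here
    console.log(photo.getAttribute("src"));  // explicit query works
    photo.setAttribute("width", "200");      // explicit update
    console.log(readAttributes(photo, ["src", "width"]));
}
```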
As a corollary, note that special attributes that are meaningful on any HTML element have no special meaning on XML elements. Recall that setting an attribute named “id” on an XML element does not mean that the element can be found with getElementById(). Similarly, you cannot style an XML element by setting its style attribute. Nor can you associate a CSS class with an XML element by setting its class attribute. All these attributes are HTML-specific.