Java and XML Basics, Part 1 - Parsing Using JAXP and the DocumentBuilder
(Page 3 of 5 )
Using the steps described above, let’s have a closer look at what our code would look like (SimpleDOMParser.java). Disregarding the try/catch blocks and the comments, it’s just as simple as this:
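// Obtain a factory; JAXP selects the concrete parser implementation for us
DocumentBuilderFactory factoryBuilder = DocumentBuilderFactory.newInstance();
// Ask the factory for a DOM parser (a DocumentBuilder)
DocumentBuilder builder = factoryBuilder.newDocumentBuilder();
// Parse the XML source into an in-memory DOM Document
Document doc = builder.parse(fileName);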
As you can see, the process of creating a parser and the entire (sometimes ugly) bunch of classes hidden behind each parser are transparent to the programmer: we work only at the interface level and need not worry about what happens “behind the scenes”.
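For reference, here is a minimal sketch of how those three calls might sit inside a complete class once the try/catch blocks are put back. It is not the SimpleDOMParser.java that accompanies this article: the class name SimpleDomParserSketch, the parseFile helper and the status messages are assumptions made purely for illustration. The sketch also wraps the file name in a File object, because DocumentBuilder.parse(String) treats its argument as a URI rather than as a plain file path.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical class name; a sketch, not the article's SimpleDOMParser.java.
public class SimpleDomParserSketch {

    // Parses the given file and returns a short status message
    // instead of throwing, so the caller can simply print it.
    public static String parseFile(String fileName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // The resulting Document is discarded; we only check that parsing works.
            builder.parse(new File(fileName));
            return "Parsing successful";
        } catch (ParserConfigurationException e) {
            // The factory could not create a parser with the requested settings.
            return "Parser configuration error: " + e.getMessage();
        } catch (SAXException e) {
            // The document is not well-formed XML.
            return "Parsing error: " + e.getMessage();
        } catch (IOException e) {
            // The file is missing or could not be read.
            return "I/O error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java SimpleDomParserSketch <file.xml>");
            return;
        }
        System.out.println(parseFile(args[0]));
    }
}
```

Each of the three checked exceptions gets its own catch block so that a missing file, a misconfigured factory and a malformed document can be told apart.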
Assuming that we launch this program against the following piece of XML (simple1.xml):
<?xml version="1.0" standalone="no"?>
<applog>
<session type="manual" date="12/12/2003">
<duration>01:00:00</duration>
<files>7</files>
<application>notepad.exe</application>
<comments>Started by the administrator to edit some config files.</comments>
</session>
<session type="automatic" date="13/12/2003">
<duration>00:10:00</duration>
<distance>37</distance>
<application>grep.exe</application>
<comments>Probably part of one of the maintenance scripts.</comments>
</session>
</applog >
We will obtain the following result:
java -classpath "%CLASSPATH%;." SimpleDOMParser simple1.xml
Parsing successfull!
This means that we have instantiated our XML parser and parsed the XML document successfully. Of course, at this stage we have to “take its word” for it, as we haven’t yet put in place any way of dumping the resulting Document on the screen. However, if we want to test our parser, we can change the XML document so that it is no longer well-formed. In the version below, the closing tag of the first duration element has been replaced by a second opening tag (simple2.xml):
<?xml version="1.0" standalone="no"?>
<applog>
<session type="manual" date="12/12/2003">
<duration>01:00:00<duration>
<files>7</files>
<application>notepad.exe</application>
<comments>Started by the administrator to edit some config files.</comments>
</session>
<session type="automatic" date="13/12/2003">
<duration>00:10:00</duration>
<distance>37</distance>
<application>grep.exe</application>
<comments>Probably part of one of the maintenance scripts.</comments>
</session>
</applog >
If we run our program against it, we get the following:
Parsing error:
org.xml.sax.SAXParseException: Expected "</duration>" to terminate element starting on line 5.
at org.apache.crimson.parser.Parser2.fatal(Unknown Source)
at org.apache.crimson.parser.Parser2.fatal(Unknown Source)
at org.apache.crimson.parser.Parser2.maybeElement(Unknown Source)
at org.apache.crimson.parser.Parser2.content(Unknown Source)
at org.apache.crimson.parser.Parser2.maybeElement(Unknown Source)
at org.apache.crimson.parser.Parser2.content(Unknown Source)
at org.apache.crimson.parser.Parser2.maybeElement(Unknown Source)
at org.apache.crimson.parser.Parser2.content(Unknown Source)
at org.apache.crimson.parser.Parser2.maybeElement(Unknown Source)
at org.apache.crimson.parser.Parser2.parseInternal(Unknown Source)
at org.apache.crimson.parser.Parser2.parse(Unknown Source)
at org.apache.crimson.parser.XMLReaderImpl.parse(Unknown Source)
at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at SimpleDOMParser.main(SimpleDOMParser.java:92)
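A full stack trace is more than we usually need when the only question is where the document is broken. As a rough sketch, we could catch SAXParseException separately and report the position the parser gives us. The class name WellFormednessCheckSketch, the describe helper and the message format are assumptions made for illustration, and the XML is parsed from an in-memory string only to keep the example self-contained. The exact error text and the reported position depend on the parser implementation in use.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class name; a sketch of position-aware error reporting.
public class WellFormednessCheckSketch {

    // Returns "well-formed" or a message carrying the position reported by the parser.
    public static String describe(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return "well-formed";
        } catch (SAXParseException e) {
            // SAXParseException carries the location at which the parser gave up.
            return "not well-formed (line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + "): " + e.getMessage();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Not expected for an in-memory string; rethrow unchecked.
            throw new IllegalStateException("Unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        // Same kind of mistake as in simple2.xml: an opening tag where a closing tag belongs.
        String good = "<session><duration>01:00:00</duration></session>";
        String bad = "<session><duration>01:00:00<duration></session>";
        System.out.println(describe(good));
        System.out.println(describe(bad));
    }
}
```

Some parser implementations also print their own error line to the standard error stream before the exception reaches our catch block, so the console output may contain more than the two lines printed here.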
More By Liviu Tudor