In a previous article (XML Basics, Part One), we had a brief look at XML. However, as stated in that article, XML itself is worth nothing without the set of APIs built around it: it would simply be just another fancy form of CSV, that is, a proprietary data format! My aim in this set of articles is not to reveal XML in its every detail, but rather to look at the implications of XML for today’s technologies. That’s why the previous article shed only a little light on the inner workings of XML, covering some of the terms and technologies you will be confronted with the moment you step into the XML arena.
Java and XML Basics, Part 1 - The javax.xml.parsers Java Package
Like every Java library/API, JAXP comes in the form of a single Java package: javax.xml.parsers. You might be surprised to find out that the package itself contains only 4 (four!) classes, 1 exception and 1 error. That is because JAXP does not implement a parser itself; instead, it defines the behaviour that a parser must (at least) support. The actual parser has to extend these classes and supply concrete implementations through the Java extension mechanism.
JAXP is available from Sun in 2 ways:
For JDKs/JREs prior to 1.4, JAXP is part of the Web Services Developer Kit, freely available from Sun’s website; however, you will need a JAXP-compliant XML parser to go with it.
With JDK 1.4, Sun has included JAXP in the standard distribution of the JDK, so it is available out of the box. What’s more, JDK 1.4+ comes with Apache’s Crimson XML parser, so you don’t have to worry about that either.
NOTE: Throughout this article, I will assume that you have either installed JDK 1.4 or, if your JDK is earlier than 1.4, downloaded both the Web Services Developer Kit and a JAXP-compliant XML parser. My personal recommendation is to use JDK 1.4 (or higher) which, as I said, already includes Crimson and the JAXP package, so you don’t need to worry about anything else once it is installed.
Note for the “hackers” of the Java/XML phenomenon:
OK, so if you’re a “real” Java/XML programmer (that is, you’ve reached the point where you are a device for turning coffee into Java source code), and you think that Crimson is not good enough, listen to this: my personal preference in terms of Java XML parsers is actually Apache Xerces-J (any version from 2.0.0 onwards will do). However, installing it over JDK 1.4+ is not that straightforward, as there’s already an XML parser “known” to the JVM at start-up: Crimson! So just copying the Xerces jars into the class path won’t do the trick! If you are up to such a challenge, jump to my appendix at the end of the article to see how you can do this. (If you are using JDK 1.3 or earlier, you don’t need to do anything apart from copying the Xerces jars into your class path, as it’s only with JDK 1.4 that Sun started shipping an XML parser with the development kit.)
The javax.xml.parsers package is structured into 2 factory classes and 2 provider classes. Factory is a common design pattern. However, for those of you unfamiliar with the term, the factory classes are used to create instances of the provider classes; this means that the provider classes have no constructors accessible from outside and cannot be created by simply using the new keyword. This allows the various parser vendors to “plug” their concrete classes into the construction process and provide classes that actually implement the JAXP “interfaces”. (For more details about the Factory design pattern, have a look at the Design Patterns book written by the famous “Gang of Four.”)
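To make the factory idea a little more tangible, here is a minimal sketch that asks JAXP for both factories and reports which concrete classes it got back. The class name FactoryDemo and its helper methods are made up for illustration; only the javax.xml.parsers calls are part of JAXP. The names printed depend entirely on which parser your JVM finds, so no particular output is promised.

```java
import java.lang.reflect.Modifier;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// FactoryDemo is a hypothetical name used only for this sketch.
class FactoryDemo {

    // Asks JAXP for a DOM factory and returns the name of the concrete
    // (vendor-supplied) class that was actually instantiated.
    static String domFactoryImplementation() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.getClass().getName();
    }

    // Same idea for the SAX side of the package.
    static String saxFactoryImplementation() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        return factory.getClass().getName();
    }

    // True if the given class is declared abstract, i.e. cannot be created with "new".
    static boolean isAbstract(Class<?> type) {
        return Modifier.isAbstract(type.getModifiers());
    }

    public static void main(String[] args) {
        // The JAXP factory classes themselves are abstract...
        System.out.println("DocumentBuilderFactory abstract? "
                + isAbstract(DocumentBuilderFactory.class));
        System.out.println("SAXParserFactory abstract? "
                + isAbstract(SAXParserFactory.class));

        // ...so newInstance() hands back whatever implementation is plugged in.
        System.out.println("DOM factory in use: " + domFactoryImplementation());
        System.out.println("SAX factory in use: " + saxFactoryImplementation());
    }
}
```

Running this on different JDKs (or with a different parser on the class path) is a quick way to see which implementation sits behind the JAXP classes.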
Around the time JAXP came out, SAX and DOM were already the 2 established approaches to XML processing. Sun decided to include a provider class for each one of these approaches (DocumentBuilder for DOM and SAXParser for SAX); each one of these classes has a corresponding factory class (DocumentBuilderFactory and SAXParserFactory). Using these factory classes you create a provider class (either DocumentBuilder or SAXParser) and then call methods on it to parse the XML, and all this regardless of what XML parser is functioning underneath.
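As a rough illustration of the SAX half of this pairing, the sketch below uses SAXParserFactory and SAXParser to count the elements in a small XML string. The class name SaxSketch, the method countElements and the sample document are assumptions made up for this example; the handler is the standard org.xml.sax.helpers.DefaultHandler.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// SaxSketch is a hypothetical name used only for this sketch.
class SaxSketch {

    // Parses the given XML text with SAX and returns how many
    // element start tags the parser reported.
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();

        // A one-element array lets the anonymous handler update the counter.
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        };

        // The parser pushes events to the handler as it reads the input.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for the example: one root, two children.
        String xml = "<books><book/><book/></books>";
        System.out.println(countElements(xml)); // prints 3
    }
}
```

Nothing in the code names Crimson, Xerces or any other parser: whichever implementation JAXP locates does the actual work.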
A basic JAXP session will look like this:
1. Instantiate a factory class.
2. Using the factory class, instantiate the provider class.
3. Using the provider class created in the previous step, perform the XML processing/parsing.
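The three steps above can be sketched with the DOM pair of classes as follows. This is only a minimal example under a few assumptions: the class name JaxpSession, the method rootElementName and the sample XML string are invented for illustration, and the input is read from memory rather than from a file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// JaxpSession is a hypothetical name used only for this sketch.
class JaxpSession {

    // Runs a complete JAXP/DOM session on the given XML text and
    // returns the tag name of the document's root element.
    static String rootElementName(String xml) throws Exception {
        // Step 1: instantiate a factory class.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // Step 2: use the factory to instantiate the provider class.
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Step 3: use the provider class to parse the XML.
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        return document.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for the example.
        String xml = "<greeting>Hello, JAXP</greeting>";
        System.out.println(rootElementName(xml)); // prints greeting
    }
}
```

A SAX session follows the same three steps, with SAXParserFactory and SAXParser taking the place of the DOM classes.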