In a previous article (XML Basics Part One), we had a brief look at XML. However, as stated in that article, XML itself is worth little without the set of APIs that have grown up around it: it would simply be just another fancy form of CSV, that is, a proprietary data format! My aim in this set of articles is not to reveal XML in its every detail, but rather to look at the implications of XML for today’s technologies. That’s why the previous article shed only a little light on the inner workings of XML, covering some of the terms and technologies you will be confronted with the moment you step into the XML arena.
Although it may sound like it, this is not another chapter about Jack and the Magic Beans. Instead, it will reveal one of the basic XML APIs in Java: the parsing API.
As soon as XML came out, numerous parsers sprang up, each one promising to solve all the problems encountered when dealing with XML, and much more. In the past, marketing people would sweat all day trying to find a name with a slight “aroma” of coffee in the hope that it would boost the sales of their products (need I remind you of all those product names ending in Café or beginning with a “J”?!).
This time the increase in share prices was dictated by an “X” in the product name. So these poor software vendors found themselves once again having to decide which XML parser to choose using the old coin-toss method. And since the coin has only two faces…
The bottom line is that you suddenly find yourself already tied to an XML parser and its API (which turns out not to solve everything, is not a validating parser, and doesn’t look like it supports the latest additions from the W3C), not to mention its license costs! Your golfer friend, who works for a company whose name contains an X somewhere (you can’t quite remember where exactly, but it doesn’t matter nowadays; it’s only important that they do some XML stuff), tells you of a new parser, much better than the one you have purchased. “Man, it does wonders: you take 100k worth of XML and in less than a second it’s all DOM’d up!” “Wow! Now that’s something!” you think. You go back to your programmers and announce, “Guys, this is the future, look into it because that’s the way we’re going!” Two days later your technical leader brings you the report about switching to “tomorrow’s XML parser” and you go all pale. The work involved in changing all the code to use this (totally) different API costs far more than the company is prepared to support. In fact, paying the licenses for the old parser is much less expensive!
Exaggerated as this portrait of a typical development manager in the early XML era may be, it is unfortunately not too far from the truth. Even nowadays, every parser has its own particularities, its own implementation, and its own set of classes, packages and so on. Writing code that is portable across parsers would be impossible without a high-level standard. And that is the whole purpose of JAXP (Java API for XML Processing).
You may be wondering why we need more than one XML parser. Why not? After all, not all of us want a parser with a complex implementation that covers every single aspect of XML and therefore reaches a considerable size (and size DOES matter when it comes to embedded code), possibly with slower processing speed, and so on. Perhaps all you want is to parse a simple XML structure that doesn’t need to be validated against a DTD. Or maybe it does need to be validated against a DTD, but without schema support, as you’re not going to use schemas. Or maybe you are not using them now, but you want to leave the door open to using them in the future, even though this may mean switching to a different XML parser (there are still a few out there, including Apache Crimson, Apache Xerces, and so on).
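To make the validating versus non-validating choice concrete, here is a minimal sketch of how that switch looks through JAXP. The class name ValidationSwitchSketch, the helper parses, and the sample document are made up for illustration. The sketch assumes a reasonably modern JDK whose bundled parser supports DTD validation. It parses a document that breaks its own inline DTD, once with validation off and once with it on.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: not part of JAXP, just a wrapper for this sketch.
class ValidationSwitchSketch {

    // Well-formed, but <from> is not allowed by the inline DTD.
    static final String SAMPLE =
        "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>"
        + "<note><from>x</from></note>";

    // Returns true if the document parses without errors in the given mode.
    static boolean parses(String xml, boolean validating) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating); // DTD validation on or off
            DocumentBuilder builder = factory.newDocumentBuilder();
            // By default validity errors are only reported, not thrown;
            // this handler turns them into exceptions so we can detect them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or invalid while validating
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("non-validating: " + parses(SAMPLE, false));
        System.out.println("validating:     " + parses(SAMPLE, true));
    }
}
```

With the parser that ships in the JDK, the non-validating pass should accept the sample and the validating pass should reject it. Note that the code names no concrete parser class anywhere; it only asks the factory for a parser with the wanted capability.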
Being able to mix and match parsers as you wish, however, means they all have to share some common ground. Bearing in mind the rule established by the “Gang of Four” design patterns “cookbook” that you should program as much as possible at the interface level rather than at the class level, Sun, together with the JCP, has come up with JAXP: a high-level set of (mostly abstract) classes meant to hide the insides of the XML parser from the programmer. This, together with the DOM API from the W3C, provides a portable way to write your Java code for any JAXP-compliant parser.
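As a rough illustration of what “programming at the interface level” means here, the sketch below parses a small document using only the abstract JAXP classes (DocumentBuilderFactory, DocumentBuilder) and the W3C DOM interfaces. The class name JaxpDomSketch, the helper titles, and the sample XML are invented for this example; the JAXP and DOM calls themselves are standard.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: only JAXP and DOM types appear, no vendor classes.
class JaxpDomSketch {

    // Collects the text of every <title> element in the given XML string.
    static List<String> titles(String xml) {
        try {
            // The factory locates whichever JAXP-compliant parser is configured.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // From here on we talk to W3C DOM interfaces only.
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>A</title></book>"
                   + "<book><title>B</title></book></books>";
        System.out.println(titles(xml));
    }
}
```

Because the code never mentions a parser implementation by name, swapping parsers should not require touching it. One standard way to pick a different implementation is to set the javax.xml.parsers.DocumentBuilderFactory system property to the factory class of the parser you want.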