So far in this series of articles (part 1, part 2) we've looked at DOM and SAX, and I suppose most of you are wondering which of the two approaches is preferable. Well, there is no general rule of thumb, but this article might help you make the right decision when you have to.
Java and XML Basics, Part 3 - Validating Parsers - DOM (Page 5 of 9)
Now let's go back to our "performant" parsing applications and try something different. So far we have fed our parsers well-formed XML documents, and we have seen that feeding them badly formed XML throws an exception. But what happens if we supply a well-formed XML file that is not valid?
For this we will slightly modify the "employee" example presented briefly in the article XML: An Introduction (employee.xml) and run it through SimpleDOMParser.java. Starting from the original well-formed and valid XML file, we will change it so that it becomes invalid (but still well-formed!):
Wow! I bet you didn't expect that after the whole discussion in this series of articles about valid XML! It looks like our parser is actually treating the XML document as valid, even though it's not! So where's the error?
Well, there is no error! The answer is quite simple: parsing an XML document doesn't necessarily mean validating it as well! Some parsers validate the XML on the fly as they parse it; others simply assume that the content is valid and perform no validation at all. A validating parser will complain about an invalid XML document such as the one we have just supplied, while a non-validating parser won't "notice" that the XML is invalid.
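To make the distinction concrete, here is a minimal sketch that assumes the standard JAXP API (`DocumentBuilderFactory`) rather than whatever parser SimpleDOMParser.java actually uses. The class name `NonValidatingDemo` and the tiny inline document are hypothetical stand-ins for the employee example: the DTD declares that `employee` must contain a `name` element, but the document supplies a `nickname` instead, so it is well-formed yet invalid.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NonValidatingDemo {

    // Hypothetical document: well-formed, but it breaks its own DTD
    // (the DTD requires <name>, the document contains <nickname>).
    static final String INVALID_XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE employee [\n"
        + "  <!ELEMENT employee (name)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "]>\n"
        + "<employee><nickname>Bob</nickname></employee>";

    // Parses with a factory left at its default settings (no validation requested).
    static Document parseWithDefaults(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("validating by default? " + factory.isValidating());

        // No exception is thrown here: only well-formedness is checked.
        Document doc = parseWithDefaults(INVALID_XML);
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
    }
}
```

Under these assumptions the parse completes without complaint, which is the same "silent acceptance" described above.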
This means that the parser used in our case is a non-validating one! Ouch! That is potentially a big problem. So where do we get a validating parser from?
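One possible answer, sketched here under the assumption that the parser is obtained through JAXP, is to ask the factory for a validating parser with `setValidating(true)` and to register an `ErrorHandler`, since validity errors are reported to the handler rather than thrown by default. The class name `ValidatingDemo`, the helper `validate`, and the inline document are hypothetical and are not taken from SimpleDOMParser.java.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidatingDemo {

    // Hypothetical document: well-formed, but <nickname> is not allowed by the DTD.
    static final String INVALID_XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE employee [\n"
        + "  <!ELEMENT employee (name)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "]>\n"
        + "<employee><nickname>Bob</nickname></employee>";

    // Parses with DTD validation switched on and returns the problems reported.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // request a validating parser

        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> problems = new ArrayList<>();

        // Validity violations arrive through error(); collect them instead of aborting.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            public void error(SAXParseException e) {
                problems.add("error: " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // not well-formed: parsing cannot continue
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        for (String problem : validate(INVALID_XML)) {
            System.out.println(problem);
        }
    }
}
```

Collecting the messages in a list, rather than throwing on the first one, is just one design choice; it lets the caller see every validity problem in a single pass.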