Java and XML Basics, Part 3 - Validating Parsers - DOM
Now let's go back to our "performant" parsing applications and try something different. So far we have fed our parsers well-formed XML documents, and we have seen that feeding them badly formed XML throws an exception. But what happens if we supply a well-formed XML file that is not valid?
For this we will take the "employee" example presented briefly in the article XML: An Introduction (employee.xml), modify it slightly, and run it through SimpleDOMParser.java. Starting from the original well-formed and valid XML file, we will change it so that it becomes invalid (but still well-formed!):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Employees [
<!ELEMENT Employees (Employee)*>
<!ELEMENT Employee EMPTY>
<!ATTLIST Employee
name CDATA #REQUIRED
surname CDATA #REQUIRED
dob CDATA #IMPLIED
email CDATA #IMPLIED
phone CDATA #IMPLIED
address CDATA #IMPLIED>
]>
<Employees>
<Employee surname="Tudor" dob="14/02/1975" email="user@domain.com" address="Coocooland"/>
<Employee name="Janet" surname="Jackson" dob="unpolite to reveal" email="janet@rhythmnation.com" address="Really really secret ;-)"/>
</Employees>
As you can see, the required name attribute is missing from the first Employee element. The result:
java -classpath "%CLASSPATH%;." SimpleDOMParser employee.xml
Parsing successful!
Wow! I bet you didn't expect that after the whole discussion about valid XML in this series of articles! It looks like our parser is treating the XML document as valid, even though it's not! So where's the error?
Well, there is no error! The answer is quite simple: parsing an XML document doesn't necessarily mean validating it as well! Some parsers validate the XML on the fly as they parse it; others simply assume that the content is valid and perform no validation at all. A validating parser will complain about invalid XML such as the document we have just supplied, while a non-validating parser won't "notice" the "invalidity" of the XML as long as it is well-formed.
This means that the parser used in our case is a non-validating one! Ouch! That is potentially a big problem. So where do we get a validating parser from?
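To make the difference concrete, here is a minimal sketch that parses a cut-down version of the invalid employee document twice: once with validation switched off and once with it switched on. It assumes the parser is obtained through the standard JAXP DocumentBuilderFactory; the class name ValidationSketch and the parse helper are made up for this illustration and are not the SimpleDOMParser used in this series.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical illustration class; not part of the original article's code.
public class ValidationSketch {

    // Cut-down invalid document: well-formed, but the #REQUIRED "name" attribute is missing.
    static final String INVALID_XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE Employees [\n"
        + "<!ELEMENT Employees (Employee)*>\n"
        + "<!ELEMENT Employee EMPTY>\n"
        + "<!ATTLIST Employee\n"
        + "  name CDATA #REQUIRED\n"
        + "  surname CDATA #REQUIRED>\n"
        + "]>\n"
        + "<Employees>\n"
        + "<Employee surname=\"Tudor\"/>\n"
        + "</Employees>\n";

    // Parses the XML and returns the validity errors the parser reported (empty if none).
    static List<String> parse(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating); // request DTD validation (or not)
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<String>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage()); // validity errors are recoverable: collect them
            }
            @Override
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // well-formedness errors abort the parse
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Non-validating: " + parse(INVALID_XML, false));
        System.out.println("Validating:     " + parse(INVALID_XML, true));
    }
}
```

With validation off, the list of reported errors should come back empty, which matches the "Parsing successful!" result above. With validation on, the parser is expected to report the missing required attribute through the error handler; the exact wording of the message depends on the parser implementation behind the factory.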
More By Liviu Tudor