Java and XML Basics, Part 3

So far in this series of articles (part 1, part 2) we've looked at DOM and SAX, and I suppose most of you are wondering which of the two approaches is preferable. Well, there is no general rule of thumb, but this article might help you make the right decision when you have to.

Author Info:
By: Liviu Tudor
April 20, 2004
  1. Java and XML Basics, Part 3
  2. Which One is the Better One to Use?
  3. Running the Parser
  4. Problems with Big XML Files
  5. Validating Parsers - DOM
  6. Where do We Get a Validating Parser?
  7. ErrorHandler
  8. Validating Parsers - SAX
  9. Conclusion

Java and XML Basics, Part 3 - Validating Parsers - DOM

Now let's go back to our parsing applications and try something different. So far we have fed our parsers well-formed XML documents, and we have seen that feeding them badly formed XML throws an exception. But what happens if we supply a well-formed XML file which is not valid?

For this we will extend the "employee" example presented briefly in the article XML: An Introduction (employee.xml) and run it against SimpleDOMParser.java. Starting from the original well-formed and valid XML file, we will change it so that it becomes invalid (but still well-formed!):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Employees [
<!ELEMENT Employees (Employee)*>
<!ELEMENT Employee EMPTY>
<!ATTLIST Employee
  name    CDATA #REQUIRED
  surname CDATA #IMPLIED
  dob     CDATA #IMPLIED
  email   CDATA #IMPLIED
  address CDATA #IMPLIED>
]>
<Employees>
 <Employee surname="Tudor" dob="14/02/1975" email="user@domain.com" address="Coocooland"/>
 <Employee name="Janet" surname="Jackson" dob="unpolite to reveal" email="janet@rhythmnation.com" address="Really really secret ;-)"/>
</Employees>

As you can see, the name attribute is missing from the first Employee element. The result:

java -classpath "%CLASSPATH%;." SimpleDOMParser employee.xml
Parsing successful!

Wow! I bet you didn't expect that after the whole discussion about valid XML in this series of articles! It looks like our parser considers the XML document valid, even though it's not! So where's the error?

Well, there is no error! The answer is quite simple: parsing an XML document doesn't necessarily mean validating it as well! Some parsers validate the XML on the fly as they parse it; others simply assume that the content is valid and perform no validation at all. A validating parser will complain about invalid XML such as the document we have just supplied, while a non-validating parser won't "notice" the "invalidity" of the XML.
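
To make the distinction concrete, here is a minimal sketch that parses the same kind of document twice, once without and once with validation. It assumes the standard JAXP API (DocumentBuilderFactory and its setValidating switch) rather than the SimpleDOMParser source, which is not reproduced on this page. The class name ValidationDemo, the parse helper and the trimmed-down employee document are illustrative assumptions, not part of the original example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not the SimpleDOMParser from this series.
public class ValidationDemo {

    // Well-formed but invalid: the first Employee lacks the required name attribute.
    // (A shortened version of the employee document, for illustration only.)
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<!DOCTYPE Employees [\n"
      + "<!ELEMENT Employees (Employee)*>\n"
      + "<!ELEMENT Employee EMPTY>\n"
      + "<!ATTLIST Employee\n"
      + "  name    CDATA #REQUIRED\n"
      + "  surname CDATA #IMPLIED>\n"
      + "]>\n"
      + "<Employees>\n"
      + " <Employee surname=\"Tudor\"/>\n"
      + " <Employee name=\"Janet\" surname=\"Jackson\"/>\n"
      + "</Employees>\n";

    // Parses the XML and returns every warning or validity error reported.
    static List<String> parse(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating); // false is the JAXP default
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            public void error(SAXParseException e) {
                // Validity violations are reported here; parsing continues.
                problems.add("error: " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                // Well-formedness violations are fatal, so rethrow them.
                throw e;
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("non-validating: " + parse(XML, false));
        System.out.println("validating:     " + parse(XML, true));
    }
}
```

With a typical JAXP implementation, the non-validating run should report nothing, while the validating run should report the missing name attribute as an error. The exact wording of the message depends on the parser in use.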

This means that the parser used in our case is a non-validating one! Ouch! That is potentially a big problem. So where do we get a validating parser?
