
Java and XML Basics, Part 3


So far in this series of articles (part 1, part 2) we've looked at DOM and SAX, and I suppose most of you are wondering which of the two approaches is preferable. Well, there is no general rule of thumb, but this article might help you make the right decision when you have to.

Author Info:
By: Liviu Tudor
April 20, 2004
TABLE OF CONTENTS:
  1. Java and XML Basics, Part 3
  2. Which One is the Better One to Use?
  3. Running the Parser
  4. Problems with Big XML Files
  5. Validating Parsers - DOM
  6. Where do We Get a Validating Parser?
  7. ErrorHandler
  8. Validating Parsers - SAX
  9. Conclusion


Java and XML Basics, Part 3 - Running the Parser
(Page 3 of 9)

Running SimpleSAXParser7 against our first XML example (simple1.xml) produces the following results:

java -classpath "%CLASSPATH%;." SimpleSAXParser7 simple1.xml
Parsing took 50 msec
Memory occupied 1496 bytes
Parsing successful!

Running SimpleDOMParser4 against the same file produces:

java -classpath "%CLASSPATH%;." SimpleDOMParser4 simple1.xml
Parsing took : 50 msec
Memory occupied : 5680 bytes
Parsing successful!
Traversing the DOM took 10 msec
Total processing time 60 msec
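
For reference, figures like the ones above could be gathered roughly as in the sketch below. This is only an assumption about how such measurements might be taken, not the actual code of SimpleSAXParser7 or SimpleDOMParser4; the class ParseMeasurementSketch and its methods are hypothetical names. Heap readings taken this way are approximate, since the garbage collector can run at any time.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not one of the classes from this series.
class ParseMeasurementSketch {

    // Approximate amount of heap currently in use by the JVM.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the given work and returns { elapsed msec, approximate change in used heap (bytes) }.
    static long[] measure(Runnable work) {
        System.gc(); // only a hint to the JVM, so the memory figure stays approximate
        long memBefore = usedMemory();
        long start = System.nanoTime();
        work.run();
        long elapsedMsec = (System.nanoTime() - start) / 1_000_000;
        long memDelta = usedMemory() - memBefore;
        return new long[] { elapsedMsec, memDelta };
    }

    public static void main(String[] args) {
        // Small in-memory document used instead of a file, purely for the example.
        final String xml = "<books><book><title>A</title></book></books>";
        long[] result = measure(() -> {
            try {
                // A do-nothing SAX handler: we only want to time the parse itself.
                SAXParserFactory.newInstance().newSAXParser()
                        .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        System.out.println("Parsing took " + result[0] + " msec");
        System.out.println("Memory occupied (approx.) " + result[1] + " bytes");
    }
}
```

The memory figure can even come out negative if a collection happens during the run, which is one more reason to treat such numbers as rough indications only.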

So overall that is a delay of about 10 msec (remember that the figures are approximate) and memory consumption that is about 4 KB higher with DOM than with SAX. Now, that isn't too much, most of you will agree (OK, apart from those of you who are still fanatical about the 48K that the good ole' Sinclair Spectrum used to have), but let's put this in perspective: the simple1.xml file is just slightly over 0.5 KB, so what would happen if we were to process a large XML file? To test this, I copied and pasted the data in simple1.xml quite a few times and ran the tests against the new file (simple4.xml, which is about 65 KB worth of XML). The results speak for themselves:

java -classpath "%CLASSPATH%;." SimpleSAXParser7 simple4.xml

Parsing took 191 msec
Memory occupied 1496 bytes
Parsing successful!

java -classpath "%CLASSPATH%;." SimpleDOMParser4 simple4.xml
Parsing took : 300 msec
Memory occupied : 438856 bytes
Parsing successful!
Traversing the DOM took 681 msec
Total processing time 991 msec

Now this is significant! While the SAX approach takes less than 0.2 seconds, the DOM implementation takes nearly a whole second; also, the memory taken by the DOM approach is huge, roughly 290 times the (nearly) 1.5 KB used in the case of SAX!

The explanation for this is quite simple:

  1. The DOM parser has to build a document tree as the parsing goes on -- this takes both time and memory -- while in the case of SAX we process the data as it "arrives", without necessarily storing it.

  2. Once the document tree is created (in the case of DOM) we then have to step through this tree and find/retrieve the information relevant to us -- in other words, we actually go through the document twice for every parse: once to build the tree and once to traverse it!
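
The two points above can be sketched with a small example that counts the elements in a document both ways. This is a minimal illustration using the standard JAXP classes, not the SimpleSAXParser7 or SimpleDOMParser4 code from this series; the class SaxVsDomSketch and the sample XML are made up for the purpose.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of the original series' code.
class SaxVsDomSketch {

    // SAX: a single pass. Each element is counted as it "arrives"; nothing is stored.
    static int countElementsSax(String xml) throws Exception {
        final int[] count = { 0 };
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    // DOM: the first pass builds the whole tree in memory, the second pass walks it.
    static int countElementsDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return countElements(doc.getDocumentElement());
    }

    // Recursive walk over the already-built tree.
    private static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title></book><book><title>B</title></book></books>";
        System.out.println("SAX counted " + countElementsSax(xml) + " elements");
        System.out.println("DOM counted " + countElementsDom(xml) + " elements");
    }
}
```

Both methods give the same count, but the DOM version keeps every node alive until the traversal is finished, while the SAX version only ever holds a single counter.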

Assuming that we only make use of a single item in the whole XML document, the (nearly) 0.5 MB taken by DOM is a waste. However, if we need to come back to the parsed information very often and we are going to use most of the data, then using SAX might be pointless, as we would have to gather all this data in a tree/list/stack/array ourselves, and that might be too much programming overhead when it's easier to just use the DOM API to traverse the tree and retrieve individual elements. Also, the DOM API is easier to use from a programmer's point of view, and as the differences are nearly unnoticeable in the case of small files, you could use the DOM approach there without too much overhead.
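
To make the trade-off a little more concrete, here is a rough sketch of what "coming back to the data" looks like in each case. With SAX we have to gather the values into our own list while parsing; with DOM we can simply query the tree whenever we need to. The class RevisitDataSketch, its methods, and the title element are assumptions made for the example, not code from this series.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; element name "title" is assumed for the example.
class RevisitDataSketch {

    // SAX: to revisit the data later we must collect it ourselves during the parse.
    static List<String> titlesViaSax(String xml) throws Exception {
        final List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while we are inside a <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return titles;
    }

    // DOM: parse once, then keep the tree around.
    static Document parseDom(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // DOM: any later lookup is a plain query against the tree.
    static String titleAt(Document doc, int index) {
        return doc.getElementsByTagName("title").item(index).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>Java</title></book><book><title>XML</title></book></books>";
        System.out.println("SAX gathered: " + titlesViaSax(xml));
        Document doc = parseDom(xml);
        System.out.println("DOM lookup:   " + titleAt(doc, 1));
    }
}
```

The SAX handler needs three callbacks and some state just to collect one kind of element, which is the programming overhead mentioned above; the DOM lookup is a one-liner, paid for with the memory needed to hold the tree.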

