Last time, we learned about JAXP, Xerces, DOM and the javax.xml.parsers Java Package. How about getting a little taste of the SAX interfaces? We look at available classes and interfaces, and learn how to use SAX for XML Processing. Given SAX's power, perhaps we can look forward to the day when we'll be translating not just XML, but maybe even Klingon! Maybe not. Before you get started, you'll want to download the support files for this tutorial.
Java and XML Basics, Part 2 - DefaultHandler class
The DefaultHandler class implements the following four handler interfaces:
ContentHandler
DTDHandler
EntityResolver
ErrorHandler
The DTDHandler and the EntityResolver are not our concern at the moment; we will deal with them later on. For now, we just need to know that the DTDHandler receives DTD notifications and that the EntityResolver is responsible for locating entities such as external references and DTDs. We will, however, make use of the ContentHandler and the ErrorHandler.
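As a quick sanity check of the claim above, the following minimal sketch asks the runtime whether a DefaultHandler instance can stand in for each of the four interfaces. The class name HandlerInterfacesCheck and the method implementsAllFour are made up for this illustration and are not part of the tutorial's support files.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.EntityResolver;
import org.xml.sax.ErrorHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for this illustration.
public class HandlerInterfacesCheck {

    // Returns true when a DefaultHandler is an instance of all four handler interfaces.
    static boolean implementsAllFour() {
        Object handler = new DefaultHandler();
        return handler instanceof ContentHandler
            && handler instanceof DTDHandler
            && handler instanceof EntityResolver
            && handler instanceof ErrorHandler;
    }

    public static void main(String[] args) {
        System.out.println("DefaultHandler implements all four: " + implementsAllFour());
    }
}
```

This is why a single DefaultHandler subclass can be handed to the parser for content and error notifications alike: you only override the callbacks you care about.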
As the name implies, the ContentHandler receives notifications about the actual content of the XML document as it is being parsed. For example, for each element it encounters, the SAX parser sends two messages to the ContentHandler: one to signal the start of the element and a second to signal its end tag. The ErrorHandler, on the other hand, receives messages about errors and problems encountered during parsing. From SAX’s point of view, there are three categories of parsing problems:
Warnings - when the parser has found minor irregularities that don’t affect the rest of the parsing
Errors - when the parser has found major irregularities that might affect the rest of the parsing if it continues
Fatal Errors - when major problems have been encountered that prevent the parsing process from going any further
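To see these notifications arrive, here is a minimal sketch of a handler that simply records every callback it receives. The class name EventTrace, the trace method and the inline sample document are assumptions made for this illustration; they do not come from the tutorial's support files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records each notification in the order it arrives.
public class EventTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    // ContentHandler callbacks: one message per element start, one per element end.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // ErrorHandler callbacks: one per problem category.
    @Override
    public void warning(SAXParseException e) { events.add("warning"); }

    @Override
    public void error(SAXParseException e) { events.add("error"); }

    @Override
    public void fatalError(SAXParseException e) { events.add("fatalError"); }

    // Parses the given XML text and returns the recorded notifications.
    static List<String> trace(String xml) {
        EventTrace handler = new EventTrace();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // The parser gave up; note that in the trace instead of rethrowing.
            handler.events.add("aborted");
        }
        return handler.events;
    }

    public static void main(String[] args) {
        System.out.println(trace("<log><duration>5</duration></log>"));
    }
}
```

For the well-formed sample document, the trace contains only content notifications: start:log, start:duration, end:duration, end:log.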
When one of these notifications is received by the ErrorHandler (the DefaultHandler in our case), it is up to the program to decide whether the parsing should continue or not. The XML purists, for example, might want to stop the parsing process even if a warning is fired, while people like me probably wouldn’t want to bother even with the fatal errors!
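The "it is up to the program" part can be sketched with a handler whose policy is switchable. Everything here is an assumption for illustration: the class names ErrorPolicyDemo and PolicyHandler, the strict flag, and the tiny inline document, whose internal DTD declares the a element as EMPTY so that a validating parser reports a recoverable error for the b child.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: the same invalid document parsed under two error policies.
public class ErrorPolicyDemo {

    // <a> is declared EMPTY, so the <b/> child is a validity error (recoverable).
    static final String INVALID_XML =
        "<?xml version=\"1.0\"?><!DOCTYPE a [<!ELEMENT a EMPTY>]><a><b/></a>";

    static class PolicyHandler extends DefaultHandler {
        final boolean strict;
        int errors = 0;
        boolean reachedEnd = false;

        PolicyHandler(boolean strict) { this.strict = strict; }

        @Override
        public void error(SAXParseException e) throws SAXException {
            errors++;
            if (strict) {
                throw e; // purist policy: abort on the first recoverable error
            }
            // lenient policy: return normally and let the parser carry on
        }

        @Override
        public void endDocument() { reachedEnd = true; }
    }

    // Parses INVALID_XML with a validating parser and returns the handler for inspection.
    static PolicyHandler run(boolean strict) {
        PolicyHandler handler = new PolicyHandler(strict);
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validity problems are reported through error()
            factory.newSAXParser().parse(new InputSource(new StringReader(INVALID_XML)), handler);
        } catch (SAXException e) {
            // Expected under the strict policy: the handler stopped the parse.
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        PolicyHandler lenient = run(false);
        PolicyHandler strict = run(true);
        System.out.println("lenient: errors=" + lenient.errors + ", reachedEnd=" + lenient.reachedEnd);
        System.out.println("strict:  errors=" + strict.errors + ", reachedEnd=" + strict.reachedEnd);
    }
}
```

Under the lenient policy the parser reaches the end of the document despite the reported errors; under the strict policy the exception thrown from error() ends the parse before endDocument is called.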
Now, let’s change our code a little and monitor the errors and the fatal errors encountered during parsing (SimpleSAXParser2.java). To achieve this, we need to override the following two methods:
...
// Recoverable error: report it and let the parser carry on.
public void error( SAXParseException e ) throws SAXException {
    System.err.println( "Following error occured during parsing:" );
    e.printStackTrace();
    System.err.println( "Parsing will continue..." );
}

// Fatal error: report it, then rethrow to stop the parsing.
public void fatalError( SAXParseException e ) throws SAXException {
    System.err.println( "Following fatal error occured during parsing:" );
    e.printStackTrace();
    System.err.println( "Parsing will stop." );
    throw new SAXException( e );
}
...
If we run this against our simple2.xml file, we get the following result:
java -classpath "%CLASSPATH%;." SimpleSAXParser2 simple2.xml
Following fatal error occured during parsing:
org.xml.sax.SAXParseException: Expected "</duration>" to terminate element starting on line 5.
    at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3182)
    at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3176)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1513)
    at org.apache.crimson.parser.Parser2.content(Parser2.java:1779)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1507)
    at org.apache.crimson.parser.Parser2.content(Parser2.java:1779)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1507)
    at org.apache.crimson.parser.Parser2.content(Parser2.java:1779)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1507)
    at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:500)
    at org.apache.crimson.parser.Parser2.parse(Parser2.java:305)
    at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:143)
    at SimpleSAXParser2.main(SimpleSAXParser2.java:98)
Parsing will stop.
Parsing error:
org.xml.sax.SAXParseException: Expected "</duration>" to terminate element starting on line 5.
    at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3182)
    at org.apache.crimson.parser.Parser2.fatal(Parser2.java:3176)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1513)
    at org.apache.crimson.parser.Parser2.content(Parser2.java:1779)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1507)
    at org.apache.crimson.parser.Parser2.content(Parser2.java:1779)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1507)
    at org.apache.crimson.parser.Parser2.content(Parser2.java:1779)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1507)
    at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:500)
    at org.apache.crimson.parser.Parser2.parse(Parser2.java:305)
    at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:143)
    at SimpleSAXParser2.main(SimpleSAXParser2.java:98)
Maybe you didn’t see that one coming, but the error we introduced was indeed a fatal one: the unclosed duration tag broke the structure of the whole document. After all, could YOU tell from this XML document alone whether the intention was to nest another duration tag inside the first one (which in turn was meant to contain further files, session and other tags), to append a second duration tag after closing the first one, or whether a / is simply missing? The parser cannot tell either, so it can no longer figure out which state it is in, and parsing cannot continue.
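The same situation can be reproduced without the support files. The sketch below is not SimpleSAXParser2.java; it is a made-up class (UnclosedTagDemo) that parses a small inline document in which the second duration tag is missing its /, a stand-in for the kind of mistake described above. It rethrows the SAXParseException directly, which has the same stopping effect as wrapping it in a new SAXException.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo of a fatal error caused by an unclosed tag.
public class UnclosedTagDemo extends DefaultHandler {

    // The second <duration> should have been </duration>.
    static final String BROKEN_XML = "<log><duration>5<duration></log>";

    final List<String> started = new ArrayList<>();
    int fatalErrors = 0;
    boolean aborted = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        started.add(qName); // elements seen before the parser gave up
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        fatalErrors++;
        System.err.println("Fatal error at line " + e.getLineNumber() + ": " + e.getMessage());
        throw e; // stop parsing
    }

    // Parses BROKEN_XML and returns the handler so its state can be inspected.
    static UnclosedTagDemo parseBroken() {
        UnclosedTagDemo handler = new UnclosedTagDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(BROKEN_XML)), handler);
        } catch (SAXException e) {
            handler.aborted = true; // the parse ended with an exception
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        UnclosedTagDemo h = parseBroken();
        System.out.println("started=" + h.started
            + ", fatalErrors=" + h.fatalErrors + ", aborted=" + h.aborted);
    }
}
```

Note that the start notifications for the elements preceding the problem have already been delivered by the time the fatal error fires. SAX reports content as it goes, so a handler that accumulates data should be prepared to discard it when the parse is aborted.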