Entities, Handlers and SAX
Picking up from where we left off yesterday, we'll take a look at entities and handlers in SAX. This article is excerpted from chapter four of the book
Java and XML, Third Edition, written by Brett McLaughlin and Justin Edelson (O'Reilly, 2006; ISBN: 059610149X). Copyright © 2006 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.
Resolving Entities continued
Of course, things are more interesting when you don’t return null. If you return an InputSource from this method, that InputSource is used in resolution of the entity reference, rather than the public or system ID specified in your schema. In other words, you can specify your own data instead of letting the reader handle resolution on its own. As an example, create a usage-terms.xml file on your local machine:
Any use of this file could result in your <i>imminent</i> destruction.
Consider yourself warned!
Now you can indicate that this file should be used via resolveEntity():
private static final String USAGE_TERMS_ID =
    "http://www.newInstance.com/entities/usage-terms.xml";
private static final String USAGE_TERMS_LOCAL_URI =
    "/your/path/to/usage-terms.xml";

public InputSource resolveEntity(String publicID, String systemID)
    throws IOException, SAXException {

    if (systemID.equals(USAGE_TERMS_ID)) {
        return new InputSource(USAGE_TERMS_LOCAL_URI);
    }
    // In the default case, return null
    return null;
}
Be sure to change the USAGE_TERMS_LOCAL_URI to match your own file-system path.
You can see that instead of allowing resolution to the online resource, an InputSource that provides access to the local version of usage-terms.xml is returned. If you recompile your source file and run the tree viewer, you can visually verify that this local copy is used.
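To see the redirection without the tree viewer, the following is a minimal, self-contained sketch rather than the book's SAXTreeViewer code. It assumes the JDK's built-in SAX parser, and it inlines the replacement text in a string (the hypothetical LOCAL_TERMS constant) instead of reading a local file, so nothing needs to exist on disk. The class names and the tiny test document are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalResolverDemo {

    static final String USAGE_TERMS_ID =
        "http://www.newInstance.com/entities/usage-terms.xml";

    // Assumption: stands in for the contents of a local usage-terms.xml file
    static final String LOCAL_TERMS =
        "Any use of this file could result in your <i>imminent</i> destruction.";

    // Returns local content for the known system ID, null for everything else
    static class InMemoryResolver implements EntityResolver {
        @Override
        public InputSource resolveEntity(String publicID, String systemID) {
            if (USAGE_TERMS_ID.equals(systemID)) {
                InputSource source = new InputSource(new StringReader(LOCAL_TERMS));
                source.setSystemId(systemID);
                return source;
            }
            return null; // let the reader resolve the entity itself
        }
    }

    // Parses a small document that references the entity and collects its text
    static String parse() throws Exception {
        String doc =
            "<!DOCTYPE book [ <!ENTITY usage-terms SYSTEM \"" + USAGE_TERMS_ID + "\"> ]>"
          + "<book>&usage-terms;</book>";

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(new InMemoryResolver());

        StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        reader.parse(new InputSource(new StringReader(doc)));
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse());
    }
}
```

Because the resolver supplies the content, the parser should never contact the URL in the entity declaration; the collected text comes from the local stand-in.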
You register this resolver on your XMLReader via the setEntityResolver() method, as shown here (using the SAXTreeViewer example again):
// Register content handler
reader.setContentHandler(jTreeHandler);

// Register error handler
reader.setErrorHandler(jTreeHandler);

// Register entity resolver
reader.setEntityResolver(new SimpleEntityResolver());

// Turn on validation
featureURI = "http://xml.org/sax/features/validation";
reader.setFeature(featureURI, true);

// Turn on schema validation, as well
featureURI = "http://apache.org/xml/features/validation/schema";
reader.setFeature(featureURI, true);
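The schema validation feature URI above is specific to Apache Xerces (and parsers derived from it), so a reader from another vendor may reject it. As a hedged sketch, a small helper can report whether a feature was accepted instead of letting the exception end the program. The FeatureSetup class and trySetFeature() method are hypothetical names introduced here for illustration; the exceptions they catch are the standard ones declared by XMLReader.setFeature().

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class FeatureSetup {

    static final String VALIDATION =
        "http://xml.org/sax/features/validation";
    static final String SCHEMA_VALIDATION =
        "http://apache.org/xml/features/validation/schema";

    // Attempts to set a feature; returns false if the reader rejects it
    static boolean trySetFeature(XMLReader reader, String featureURI, boolean value) {
        try {
            reader.setFeature(featureURI, value);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            System.err.println("Feature unavailable: " + featureURI);
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        trySetFeature(reader, VALIDATION, true);
        trySetFeature(reader, SCHEMA_VALIDATION, true);
    }
}
```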
Figure 4-3 shows the usage-terms entity reference expanded, using the local file, rather than the URI specified in the schema.
In real-world applications, resolveEntity() tends to become a lengthy laundry list of if/then/else blocks, each one handling a specific system or public ID. And this brings up an important point: don't let this method become a kitchen sink for IDs. If you no longer need a specific resolution to occur, remove the if clause for it.

Figure 4-3. This time, the local entity is used (and parsed, as seen by the expanded i element)
Additionally, try to use different EntityResolver implementations for different applications, rather than creating one generic implementation for all your applications. Doing this avoids code bloat and, more importantly, speeds up entity resolution. If you have to wait for your reader to run through 50 String.equals() comparisons, you can really bog down an application. Be sure to put frequently accessed references at the top of the if/else stack, as well, so they are encountered first and result in quicker entity resolution.
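One alternative to ordering a long if/else stack, offered here as a sketch and not as the book's approach, is to keep the ID-to-location pairs in a map so each lookup is a single hash probe regardless of how many IDs are registered. The MapEntityResolver class and its register() method are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class MapEntityResolver implements EntityResolver {

    // Maps a system ID to the local URI that should be used in its place
    private final Map<String, String> localUris = new HashMap<>();

    // Registers a redirection; returns this so calls can be chained
    MapEntityResolver register(String systemID, String localUri) {
        localUris.put(systemID, localUri);
        return this;
    }

    @Override
    public InputSource resolveEntity(String publicID, String systemID) {
        String localUri = localUris.get(systemID);
        // Unknown IDs fall through to the reader's default resolution
        return (localUri == null) ? null : new InputSource(localUri);
    }
}
```

Usage might look like `reader.setEntityResolver(new MapEntityResolver().register(USAGE_TERMS_ID, USAGE_TERMS_LOCAL_URI));`. Removing a resolution you no longer need then means deleting one register() call.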
Finally, I want to make one more recommendation concerning your EntityResolver implementations. You’ll notice that I defined my implementation in a separate class file, while the ErrorHandler, ContentHandler, and (as you’ll see in “Notations and Unparsed Entities”) DTDHandler implementations were in the same source file in which parsing occurred. That wasn’t an accident! You’ll find that the way you deal with content, errors, and DTDs is fairly static. You write your program, and that’s it. When you make changes, you’re performing a larger code rewrite, so recompiling your core parsing program is a given.
However, you’ll make many changes to the way you want your application to resolve entities. Depending on the machine you’re on, the type of client you’re deploying to, and what (and where) documents are available, you’ll often use several different versions of an EntityResolver implementation. To allow for rapid changes to this implementation without editing or recompiling your core parsing code, I use a separate source file for EntityResolver implementations; I suggest you do the same. And with that, you should know all that you need to know about resolving entities in your applications using SAX.
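Keeping the resolver in its own class also makes it possible to choose an implementation at startup. The following is a rough sketch under stated assumptions, not something the book prescribes: it reads a class name from a hypothetical system property, entity.resolver.class, and instantiates it reflectively, so swapping resolvers does not require touching the parsing code. NoOpResolver and ResolverLoader are illustrative names.

```java
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class ResolverLoader {

    // Hypothetical property name; pick whatever fits your deployment
    static final String RESOLVER_PROPERTY = "entity.resolver.class";

    // Loads the named class and creates it via its no-argument constructor
    static EntityResolver load(String className) throws ReflectiveOperationException {
        return Class.forName(className)
                    .asSubclass(EntityResolver.class)
                    .getDeclaredConstructor()
                    .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Fall back to the do-nothing resolver when the property is not set
        String className =
            System.getProperty(RESOLVER_PROPERTY, NoOpResolver.class.getName());
        EntityResolver resolver = load(className);
        System.out.println("Using resolver: " + resolver.getClass().getName());
    }
}

// Default implementation: always defers to the reader's own resolution
class NoOpResolver implements EntityResolver {
    @Override
    public InputSource resolveEntity(String publicID, String systemID) {
        return null;
    }
}
```

The loaded instance would then be passed to reader.setEntityResolver() exactly as a hard-coded one would be.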