Using XML in a Java Context - Formatting an XML Document
As described earlier, XOM does not retain insignificant whitespace when representing XML documents. This is in keeping with one of XOM's design goals—to disregard anything that has no syntactic significance in XML. (Another example of this is how text is treated identically whether created using character entities, CDATA sections, or regular characters.)
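The point about text being treated identically can be illustrated with a small sketch. It uses the DOM parser built into the JDK rather than XOM, so it only demonstrates the general XML behavior, not XOM's internals; the class name TextEquivalenceDemo and the method rootText() are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; it relies on the JDK's DOM parser, not on XOM.
class TextEquivalenceDemo {

    // Parse a small XML string and return the text content of its root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return parser.parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] arguments) {
        // Regular characters
        System.out.println(rootText("<name>Java</name>"));
        // A numeric character reference for the letter J
        System.out.println(rootText("<name>&#74;ava</name>"));
        // A CDATA section
        System.out.println(rootText("<name><![CDATA[Java]]></name>"));
        // Each line printed is: Java
    }
}
```

All three inputs yield the same text once parsed, which is why a library can reasonably choose not to remember which form the original file used.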
Today's next project is the SerialDomains application, which adds a comment to the beginning of the XML document domains2.xml and serializes it with indented lines, producing the version shown in Listing 20.9.
Listing 20.9 The Full Text of domains2.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!--File created Fri Nov 21 17:31:47 EST 2003-->
<domains>
  <domain>
    <name>java21days.com</name>
    <dns>
      <ttl>604800</ttl>
      <ip>64.81.250.253</ip>
    </dns>
    <webdir status="parked">/web/java21days</webdir>
    <email>
      <address user="postmaster@java21days.com" destination="rcade"/>
    </email>
  </domain>
</domains>
The Serializer class in nu.xom offers control over how an XML document is formatted when it is displayed or stored serially. Indentation, character encoding, line breaks, and other formatting are established by objects of this class.
A Serializer object can be created by specifying an output stream and character encoding as arguments to the constructor:
File inFile = new File(arguments[0]);
FileOutputStream fos =
    new FileOutputStream("new_" + inFile.getName());
Serializer output = new Serializer(fos, "ISO-8859-1");
These statements create a serializer that writes to a file using the ISO-8859-1 character encoding. The output file's name is "new_" followed by the name of the file given as the first command-line argument.
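The naming rule can be checked on its own with a short sketch that uses only java.io.File. The class name OutputNameDemo and the method outputName() are hypothetical names introduced for this illustration.

```java
import java.io.File;

// Hypothetical demo class showing how the output file name is derived.
class OutputNameDemo {

    // Build the output name the same way the statements above do:
    // "new_" plus the last component of the input path.
    static String outputName(String inputPath) {
        return "new_" + new File(inputPath).getName();
    }

    public static void main(String[] arguments) {
        System.out.println(outputName("domains2.xml"));      // new_domains2.xml
        System.out.println(outputName("data/domains2.xml")); // new_domains2.xml
    }
}
```

Because getName() drops any directory portion of the path, the new file is created in the current working directory even when the input file lives somewhere else.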
Serializer currently supports 20 encodings, including ISO-10646-UCS-2, ISO-8859-1 through ISO-8859-10, ISO-8859-13 through ISO-8859-16, UTF-8, and UTF-16. There's also a Serializer() constructor that takes only an output stream as an argument; this uses the UTF-8 encoding by default.
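To see why the choice of encoding matters, the following sketch compares how two of these encodings represent the same text. It uses only the JDK's java.nio.charset classes and does not call XOM; the class name EncodingDemo and its helper methods are hypothetical names for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing ISO-8859-1 and UTF-8 with JDK classes only.
class EncodingDemo {

    // Number of bytes needed to store the text in the given encoding.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Whether every character of the text exists in the given encoding.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] arguments) {
        String word = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(encodedLength(word, StandardCharsets.ISO_8859_1)); // 4
        System.out.println(encodedLength(word, StandardCharsets.UTF_8));      // 5

        String euro = "\u20ac"; // the euro sign, which ISO-8859-1 lacks
        System.out.println(canEncode(euro, StandardCharsets.ISO_8859_1)); // false
        System.out.println(canEncode(euro, StandardCharsets.UTF_8));      // true
    }
}
```

A character that the chosen encoding cannot represent has to be written some other way in the output, typically as a numeric character reference, so an encoding that covers the document's characters keeps the serialized file more readable.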
Indentation is set by calling the serializer's setIndent() method with an integer argument specifying the number of spaces:
output.setIndent(2);
An entire XML document is written to the serializer destination by calling the serializer's write() method with the document as an argument:
output.write(doc);
The SerialDomains application inserts a comment at the top of the XML document instead of appending it after a parent node's existing children. This requires another method of the parent node, insertChild(), which is called with two arguments: the node to add and the integer position at which to insert it:
Builder builder = new Builder();
Document doc = builder.build(arguments[0]);
Comment timestamp = new Comment("File created " +
    new java.util.Date());
doc.insertChild(timestamp, 0);
The comment is placed at position 0, the top of the document. This moves the domains element down one line, while the comment itself remains below the XML declaration.
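For comparison, the same placement can be sketched with the DOM API that ships with the JDK. This is not XOM code: DOM has no positional insertChild(), so the sketch assumes the equivalent step is insertBefore() with the document's current first child as the reference node. The class name CommentFirstDemo and the method withLeadingComment() are hypothetical names for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses the JDK's DOM API as a stand-in for XOM.
class CommentFirstDemo {

    // Parse the XML text and place a comment ahead of every other child
    // of the document, including the root element.
    static Document withLeadingComment(String xml, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Comment comment = doc.createComment(text);
            // Inserting before the current first child puts the comment
            // at position 0 among the document's children.
            doc.insertBefore(comment, doc.getFirstChild());
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] arguments) {
        Document doc = withLeadingComment("<domains><domain/></domains>",
            "File created " + new java.util.Date());
        Node first = doc.getFirstChild();
        System.out.println(first.getNodeType() == Node.COMMENT_NODE); // true
        System.out.println(doc.getDocumentElement().getNodeName());   // domains
    }
}
```

In both APIs the XML declaration is not a child node of the document, which is why a node inserted at the first position still appears after the declaration when the document is written out.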
Listing 20.10 contains the source code of the application.
Listing 20.10 The Full Text of SerialDomains.java
import java.io.*;
import nu.xom.*;

public class SerialDomains {
    public static void main(String[] arguments) throws IOException {
        try {
            // Create a tree from an XML document
            // specified as a command-line argument
            Builder builder = new Builder();
            Document doc = builder.build(arguments[0]);

            // Create a comment with the current time and date
            Comment timestamp = new Comment("File created "
                + new java.util.Date());

            // Add the comment above everything else in the
            // document
            doc.insertChild(timestamp, 0);

            // Create a file output stream to a new file
            File inFile = new File(arguments[0]);
            FileOutputStream fos = new FileOutputStream(
                "new_" + inFile.getName());

            // Using a serializer with indentation set to 2 spaces,
            // write the XML document to the file
            Serializer output = new Serializer(fos, "ISO-8859-1");
            output.setIndent(2);
            output.write(doc);
        } catch (ParsingException pe) {
            System.out.println("Error: " + pe.getMessage());
            pe.printStackTrace();
            System.exit(-1);
        }
    }
}
The SerialDomains application takes an XML filename as a command-line argument when run:
java SerialDomains domains2.xml
This command produces a file called new_domains2.xml that contains an indented copy of the XML document with a time stamp inserted as a comment. This document was shown earlier in Listing 20.9.
This article is excerpted from chapter 20 of the book Sams Teach Yourself Java 2 in 21 Days, 4th Edition, written by Rogers Cadenhead and Laura Lemay (Sams; ISBN: 0672326280).