This is chapter one from the book, XML and FrameMaker, by Kay Ethier (Apress, ISBN: 159059276X. 2004). Ethier reviews some of the basic XML terms and rules, and provides a basic overview of the purpose of DTDs and XSLT.
Introduction to XML - Understanding XML Rules (4-5)
Rule 4: Tag names can include underscores, letters, and numbers, but not spaces.
Later in this book, you'll look at FrameMaker naming and refer to the issue of spaces inside element names.
In the following accounting-related example, <acctg> is the root element. This example shows a variety of element names, including names with underscores and numbers. Spaces are not allowed in the element names.
It is also important to note that XML is case sensitive, unlike HTML, which allows you to mix case (see following code).
<p>This is a small paragraph of text.</P>
XML beginning and end tags must match in case (see following code).
<Body>This is a small paragraph of text.</Body>
Spaces are not allowed in element names because they cause confusion. Attributes are separated from the element name, and from each other, by spaces, so any tool or human moving through the XML will see a space and assume that an attribute follows.
If the case does not match, then you may get an error when attempting to use or view the XML. Internet Explorer, for example, will not display the file content if case is mismatched. Instead, it displays a message that the end tag does not match the beginning tag.
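The two parts of Rule 4 can be checked with any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module stands in for the browser; the helper name is_well_formed and the last two sample strings are illustrative and do not come from the book.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The first two samples come from the text above; the last two are
# hypothetical snippets built from element names used in this chapter.
samples = {
    "mixed case": "<p>This is a small paragraph of text.</P>",
    "matching case": "<Body>This is a small paragraph of text.</Body>",
    "underscore and digit": "<address1><zip_plus4>27709-4265</zip_plus4></address1>",
    "space in name": "<inv num>123</inv num>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

In this sketch the parser rejects the mixed-case pair and the name containing a space, and accepts the other two.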
Rule 5: Except for empty elements, XML elements must have beginning and end tags.
What else might be in an XML document? You might have empty tags. Empty tags may be used in place of beginning and end tags for elements that have no content. To clarify this with some HTML examples, in HTML you might have:
<img src="corplogo.gif">
or perhaps
<hr>
NOTE: The <hr> is the HTML element for a horizontal rule.
These are tags with no content, but they are serving some specific function. The equivalents expressed in XML, with the ending slash, are:
<img src="corplogo.gif"/>
and
<hr/>
NOTE: You might need to type a space in front of the empty element's ending slash or your documents may not display in some browsers. This is good practice, and does not cause any issues.
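To see Rule 5 in action, the sketch below feeds four spellings of a horizontal rule to a parser. It assumes Python's standard xml.etree.ElementTree module; the <doc> wrapper and the helper name parses are illustrative choices, not part of the book's examples.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


html_style = "<doc><hr></doc>"      # HTML habit: no end tag, no slash
empty_tag = "<doc><hr/></doc>"      # XML empty-element tag
empty_spaced = "<doc><hr /></doc>"  # same, with a space before the slash
paired = "<doc><hr></hr></doc>"     # explicit beginning and end tags

for label, text in [("html_style", html_style), ("empty_tag", empty_tag),
                    ("empty_spaced", empty_spaced), ("paired", paired)]:
    print(label, parses(text))

# The empty-element tag and the explicit pair describe the same element,
# so they serialize identically after parsing.
same_element = (ET.tostring(ET.fromstring(empty_tag))
                == ET.tostring(ET.fromstring(paired)))
print("empty tag equals paired tags:", same_element)
```

Only the HTML-style form is rejected here. The space before the slash makes no difference to the parser.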
In this next XML document example, <doc> is the root element. An image-type element is used, although in this case it is called <figure>. If you wish to use the more familiar <img> in your XML documents, that is your choice. Again, spacing is used to show the nesting.
<?xml version="1.0"?>
<doc>
  <chapstart>
    <title>Buying a Car</title>
    <author>
      <name>John Doe</name>
      <figure source="doe02.svg" />
    </author>
  </chapstart>
  <section>
    <title>Selecting a Body Type</title>
    <para>Some text would be here.</para>
    <para>Some text would be here.</para>
  </section>
  <section>
    <title>Selecting a Manufacturer</title>
    <para>Some text in here as well.</para>
  </section>
</doc>
One important comment on the preceding example documents: because the tags denote where pieces of information start and stop, you can display XML in several ways.
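You can display a document with spacing, as shown in the following code:
<?xml version="1.0"?>
<acctg>
  <invoice>
    <inv_num>123</inv_num>
    <client>
      <company_name>AllianceCorporation</company_name>
      <contact_name>Nancy</contact_name>
      <address1>4601 Creekstone Drive, Suite 112</address1>
      <address2>PO Box 14265</address2>
      <city>Research Triangle Park</city>
      <state>NC</state>
      <zip_plus4>27709-4265</zip_plus4>
    </client>
    <amount>55,400.00</amount>
    <due_date>2003/11/02</due_date>
  </invoice>
</acctg>
Displaying it with indents is equivalent to displaying it without them, as in the following sample:
<?xml version="1.0"?><acctg><invoice><inv_num>123</inv_num><client><company_name>AllianceCorporation</company_name><contact_name>Nancy</contact_name><address1>4601 Creekstone Drive, Suite 112</address1><address2>PO Box 14265</address2><city>Research Triangle Park</city><state>NC</state><zip_plus4>27709-4265</zip_plus4></client><amount>55,400.00</amount><due_date>2003/11/02</due_date></invoice></acctg>
The line breaks are used mostly to make it easier on human editors and readers. All views are equivalent XML, but some are easier to read.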
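The claim that the indented and unindented views are equivalent can be checked by comparing their element structure. This is a rough sketch, assuming Python's standard xml.etree.ElementTree module and a shortened version of the invoice; the outline helper is a made-up name that records each element's tag, trimmed text, and children.

```python
import xml.etree.ElementTree as ET


def outline(element):
    """Return a nested (tag, text, children) tuple, ignoring surrounding whitespace."""
    text = (element.text or "").strip()
    return (element.tag, text, tuple(outline(child) for child in element))


# Shortened invoice, written with indentation and line breaks.
indented = """<acctg>
  <invoice>
    <inv_num>123</inv_num>
    <amount>55,400.00</amount>
    <due_date>2003/11/02</due_date>
  </invoice>
</acctg>"""

# The same invoice on a single line.
compact = ("<acctg><invoice><inv_num>123</inv_num>"
           "<amount>55,400.00</amount>"
           "<due_date>2003/11/02</due_date></invoice></acctg>")

same_structure = outline(ET.fromstring(indented)) == outline(ET.fromstring(compact))
print("same structure:", same_structure)
```

The comparison deliberately trims whitespace, which is the sense in which the two views are the same document to a human reader.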
NOTE: When importing XML into FrameMaker, FrameMaker does not filter out white space. Depending on whether the spaces (or tabs) are inside or between elements, you can end up with unwanted <WHITESPACE> elements that must be cleaned out. To avoid this, do not use spaces or tabs within your XML.
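The white space mentioned in the note is real character data as far as a parser is concerned. The sketch below does not model FrameMaker's import. It only assumes Python's standard xml.etree.ElementTree module and counts the whitespace-only strings a generic parser keeps for an indented document and for a compact one. The function name whitespace_only_nodes is illustrative.

```python
import xml.etree.ElementTree as ET


def whitespace_only_nodes(root):
    """Count text and tail strings that contain nothing but whitespace."""
    count = 0
    for element in root.iter():
        for value in (element.text, element.tail):
            if value and not value.strip():
                count += 1
    return count


# Indentation and line breaks sit between the elements.
indented = ("<doc>\n"
            "  <title>Buying a Car</title>\n"
            "  <para>Some text would be here.</para>\n"
            "</doc>")

# No characters at all between the elements.
compact = ("<doc><title>Buying a Car</title>"
           "<para>Some text would be here.</para></doc>")

indented_count = whitespace_only_nodes(ET.fromstring(indented))
compact_count = whitespace_only_nodes(ET.fromstring(compact))
print("indented:", indented_count)
print("compact:", compact_count)
```

The indented form carries whitespace-only strings between its elements and the compact form carries none, which is why unindented XML gives an importing tool nothing extra to clean out.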