When I first heard about XML, I thought it was something similar to HTML. Needless to say, I was wrong. XML and HTML were designed with different goals. XML was designed to describe data and HTML was designed to display data. In this article we will try to learn some basics about XML. Then we will learn about DTD and XML schemas.
XML Basics - XML Tags and Conventions
XML tags follow a naming convention. Tag names must not contain spaces, must not start with a digit or a punctuation character (an underscore is allowed), and must not start with the letters "xml" in any combination of uppercase and lowercase. After the first character, tag names can contain letters, digits, and a few other characters such as hyphens, underscores, and periods.
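The naming rules above can be sketched as a small Python check. This is a simplified illustration, not the full XML name grammar: it assumes ASCII-only names and ignores colons and the non-ASCII characters that the specification also permits. The function name is_acceptable_tag_name is made up for this example.

```python
import re

# Simplified pattern (an assumption for this sketch): the first character is
# an ASCII letter or underscore; the rest are letters, digits, periods,
# underscores, or hyphens. The real XML grammar allows more characters.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_acceptable_tag_name(name: str) -> bool:
    """Return True if `name` passes the simplified naming rules."""
    if not _NAME_RE.fullmatch(name):
        return False
    # Names beginning with "xml" in any letter case are reserved.
    return not name.lower().startswith("xml")


for candidate in ["firstname", "first name", "2ndname", "XmlData"]:
    print(candidate, is_acceptable_tag_name(candidate))
```

Only the first candidate passes; the others contain a space, start with a digit, or start with the reserved "xml" prefix.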
An XML document is said to be well-formed if it follows the rules of the XML specification. The following list includes some examples of these rules:
All XML elements must have a closing tag. Do not omit the end tag. If an element has no data you can use the empty tag. For example,
<data> : not well-formed
<data>data 1</data> : well-formed
<data/> : well-formed
<data></data> : well-formed
Matching Start and End tag. In XML all elements must be properly nested within each other like this:
<b><i>this is bold and italic</i></b> : well-formed
<b><i>this is bold and italic</b></i> : not well-formed
Ensure that a root element exists that encloses the entire document body. This root element makes the processing of the XML document easier when traversing the relevant tree structure.
In XML the attribute value must always be quoted. Use double quote marks if the value of the attribute contains single quote marks and vice versa.
<actors name='Frank "Riddler" Gorshin'>
<actors name="Frank 'Riddler' Gorshin">
Tag and attribute names are case-sensitive. So, <firstname>, <Firstname> and <FirstName> are different tags.
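One way to try these rules out is to hand small snippets to an XML parser and see which ones it rejects. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are chosen for this example. Note that it checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = [
    "<data>",                                     # end tag missing
    "<data>data 1</data>",
    "<data/>",
    "<b><i>this is bold and italic</i></b>",
    "<b><i>this is bold and italic</b></i>",      # improperly nested
    "<actors name=\"Frank 'Riddler' Gorshin\"/>",
    "<actors name=Frank/>",                       # attribute value not quoted
    "<firstname>x</Firstname>",                   # tag names differ in case
    "<a/><b/>",                                   # no single root element
]

for sample in samples:
    print(is_well_formed(sample), sample)
```

The parser accepts the snippets that follow the rules and raises a ParseError for the rest, which the helper turns into False.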
To be of practical use in data exchange, an XML document should also be valid. The XML specification defines an XML document as valid if it has an associated DTD (Document Type Definition) and if the document complies with the constraints defined in that DTD. A DTD is like the vocabulary and syntax rules for your XML documents. It defines the data structure of an XML document: which tags may appear, in what order, and how many times. Companies that exchange XML documents can check them against the same DTD.
A DTD describes elements with element declarations, which use the following general syntax:
<!ELEMENT element-name (element-content)>
For our Book XML example we can define the <book> element as follows:
<!ELEMENT book (title, author, year, price)>
For the title the DTD definition is:
<!ELEMENT title (#PCDATA)>
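To make the meaning of these two declarations concrete, here is a toy Python sketch that reads them and checks the child order of a book element. It is not a DTD validator: it assumes declarations of the simple comma-separated form shown above, and the sample document, its values, and the function names are invented for illustration. Python's standard library does not validate against DTDs, so real projects would rely on a validating parser instead.

```python
import re
import xml.etree.ElementTree as ET

# Element declarations taken from the text above.
DTD_TEXT = """
<!ELEMENT book (title, author, year, price)>
<!ELEMENT title (#PCDATA)>
"""

# Hypothetical sample document; the values are placeholders.
BOOK_XML = (
    "<book><title>T</title><author>A</author>"
    "<year>2000</year><price>9.99</price></book>"
)


def declared_children(dtd_text: str) -> dict:
    """Map each declared element name to the list of names in its content model."""
    decls = {}
    for name, model in re.findall(r"<!ELEMENT\s+(\S+)\s+\(([^)]*)\)\s*>", dtd_text):
        decls[name] = [part.strip() for part in model.split(",")]
    return decls


def children_match(element: ET.Element, decls: dict) -> bool:
    """Check that the element's children appear exactly in the declared order."""
    expected = decls.get(element.tag)
    if expected is None:
        return False                      # element was never declared
    if expected == ["#PCDATA"]:
        return len(element) == 0          # text only, no child elements
    return [child.tag for child in element] == expected


decls = declared_children(DTD_TEXT)
book = ET.fromstring(BOOK_XML)
print(children_match(book, decls))
```

Swapping the order of title and author in the sample document makes the check fail, which mirrors how the book declaration fixes the sequence of child elements.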
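#PCDATA and CDATA are predefined keywords for character data.
#PCDATA stands for parsed character data: text that the XML parser analyzes for markup and entity references. An element declared as (#PCDATA) contains text only; child elements may appear alongside the text only when they are listed together with #PCDATA in the declaration.
CDATA stands for character data. In a DTD it is used as an attribute type for plain string values, and text placed inside a CDATA section of a document is not analyzed for markup by XML parsers.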
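The difference between parsed and unparsed character data can be seen with a short Python sketch using the standard xml.etree.ElementTree module. It assumes the unparsed case is written as a CDATA section (<![CDATA[ ... ]]>) inside the document; the element content is made up for the example.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the parser resolves the entity reference &lt;.
parsed = ET.fromstring("<title>1 &lt; 2</title>")

# CDATA section: the characters are taken literally, so < and & need no
# escaping and <b>...</b> is kept as text rather than read as a tag.
literal = ET.fromstring("<title><![CDATA[1 < 2 & <b>not a tag</b>]]></title>")

print(parsed.text)     # text after entity resolution
print(literal.text)    # text exactly as written inside the CDATA section
print(len(literal))    # number of child elements found in the CDATA case
```

In the first case the parser interprets the content; in the second it passes the content through unchanged, so no child element is created for the b tag.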