XML Basics - XML Tags and Conventions
XML tag names follow a naming convention. A name must not contain spaces and must begin with a letter or an underscore, not with a digit or a punctuation character such as a hyphen or a period. Names that begin with the letters "xml", in any combination of capital and lowercase letters, are reserved. After the first character, a name can contain letters, digits, hyphens, underscores, and periods. A colon is also allowed, but it is reserved for namespaces.
An XML document is said to be well-formed if it follows the syntax rules of the XML specification. The following list includes some examples of these rules:
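The naming rules can be approximated with a short check. This is only a sketch: the function name `is_acceptable_tag_name` is made up for illustration, and the pattern is an ASCII-only simplification that assumes a name starts with a letter or underscore. The real XML grammar also admits many non-ASCII characters.

```python
import re

# Simplified, ASCII-only approximation of the XML naming rules.
# Assumption: first character is a letter or underscore; later characters
# may also be digits, periods, hyphens, or underscores.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_acceptable_tag_name(name: str) -> bool:
    """Return True if `name` passes the simplified naming rules."""
    if _NAME_RE.fullmatch(name) is None:
        return False
    # Names starting with "xml" in any letter case are reserved.
    return not name.lower().startswith("xml")


if __name__ == "__main__":
    for candidate in ["book", "first_name", "2ndEdition", "first name", "XmlData"]:
        print(candidate, is_acceptable_tag_name(candidate))
```

A real parser remains the authority on which names are legal. A check like this is mainly useful for catching obvious mistakes before generating XML.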
- All XML elements must have a closing tag; do not omit the end tag. If an element has no content, you can use an empty-element tag. For example:
  - <data> : not well-formed
  - <data>data 1</data> : well-formed
  - <data/> : well-formed
  - <data></data> : well-formed
- Start and end tags must match, and all elements must be properly nested within each other, like this:
  - <b><i>this is bold and italic</i></b> : well-formed
  - <b><i>this is bold and italic</b></i> : not well-formed
- Ensure that exactly one root element exists and that it encloses the entire document body. The root element gives the document a single tree structure, which makes it easier to traverse when the XML document is processed.
- Attribute values must always be quoted. Use double quote marks if the value of the attribute contains single quote marks, and vice versa:
  - <actors name='Frank "Riddler" Gorshin'>
  - <actors name="Frank 'Riddler' Gorshin">
- Tag and attribute names are case-sensitive. So, <firstname>, <Firstname> and <FirstName> are different tags.
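One way to try out the rules above is to hand the snippets to an XML parser and see which ones it rejects. The sketch below uses Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` is made up for illustration. It checks well-formedness only, not validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


SAMPLES = [
    "<data>",                                   # missing end tag
    "<data>data 1</data>",
    "<data/>",                                  # empty-element tag
    "<b><i>this is bold and italic</i></b>",
    "<b><i>this is bold and italic</b></i>",    # improperly nested
    "<data/><data/>",                           # more than one root element
    "<actors name=\"Frank 'Riddler' Gorshin\"/>",
    "<actors name=Frank/>",                     # unquoted attribute value
    "<firstname>Frank</Firstname>",             # tag names differ in case
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(is_well_formed(sample), sample)
```

The last sample fails because the parser treats <firstname> and <Firstname> as different tags, so the end tag does not match the start tag.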
Beyond being well-formed, an XML document that is exchanged between parties often also needs to be valid. The XML specification defines a document as valid if it has an associated document type declaration and complies with the constraints expressed in it. Those constraints are written as a DTD (Document Type Definition). A DTD is like the vocabulary and syntax rules for your XML documents: it defines the structure of a document, such as which tags may appear, in what order, and how many times. Companies that exchange XML documents can check them against the same DTD.
A DTD declares elements using the following syntax:
<!ELEMENT name-of-the-element description-of-the-element>
For our Book XML example we can define the <book> element as follows:
<!ELEMENT book (title, author, year, price)>
For the title the DTD definition is:
<!ELEMENT title (#PCDATA)>
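#PCDATA and CDATA are predefined keywords. #PCDATA is used in element declarations, and CDATA is used as an attribute type.
#PCDATA stands for parsed character data. The parser analyzes this text for markup, so entity references in it are expanded. An element declared as (#PCDATA), like title above, contains character data only and no child elements.
CDATA, as an attribute type, means the attribute value is ordinary character data: any string is accepted, and the parser does not interpret it as a name, an ID, or a list of tokens.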
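To make the meaning of the book declaration concrete, the sketch below checks a parsed <book> element against the same child order by hand. It is not a DTD validator, and Python's standard library does not include one. The function name `matches_book_model` and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Child order required by <!ELEMENT book (title, author, year, price)>:
# each child exactly once, in this sequence.
EXPECTED_CHILDREN = ["title", "author", "year", "price"]


def matches_book_model(book: ET.Element) -> bool:
    """Hand-written stand-in for one DTD rule; not real DTD validation."""
    if book.tag != "book":
        return False
    if [child.tag for child in book] != EXPECTED_CHILDREN:
        return False
    # <!ELEMENT title (#PCDATA)>: title holds text only, no child elements.
    return len(book.find("title")) == 0


if __name__ == "__main__":
    # Hypothetical sample data, used only to exercise the check.
    in_order = ET.fromstring(
        "<book><title>T</title><author>A</author>"
        "<year>2000</year><price>10</price></book>"
    )
    out_of_order = ET.fromstring(
        "<book><author>A</author><title>T</title>"
        "<year>2000</year><price>10</price></book>"
    )
    print(matches_book_model(in_order))
    print(matches_book_model(out_of_order))
```

Both sample documents are well-formed, but only the first follows the declared order. This is the difference between a well-formed document and a valid one.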
Attributes can be declared using the pattern:
<!ATTLIST element-name attribute-name attribute-type default-value>
The id attribute in the above XML example can be declared as:
<!ATTLIST title id CDATA "0">
Here, the element name is title, the attribute name is id, the attribute type is CDATA (in other words, character data), and the default value is "0".
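The effect of a default value can be observed with a parser that reads the DTD. The sketch below uses Python's standard `xml.parsers.expat` module, which does not validate but does process declarations placed in a document's internal DTD subset. The sample document and the helper name `collect_title_ids` are made up for illustration.

```python
import xml.parsers.expat

# Hypothetical document with the ATTLIST declaration in its internal subset.
# Expat does not validate, so only the declarations relevant here are included.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT title (#PCDATA)>
  <!ATTLIST title id CDATA "0">
]>
<book>
  <title id="42">First title</title>
  <title>Second title</title>
</book>"""


def collect_title_ids(document: str) -> list:
    """Return the id attribute reported for each <title> start tag."""
    ids = []

    def on_start(name, attrs):
        # attrs is a dict of attribute names to values for this element.
        if name == "title":
            ids.append(attrs.get("id"))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return ids


if __name__ == "__main__":
    print(collect_title_ids(DOCUMENT))
```

The first title reports the id written in the document. The second title has no id in its start tag, so the parser supplies the declared default.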