The Makeup of an XML Document – A Quick Primer
By: Zaid Siddiqui
    2002-11-07

    Table of Contents:
  • The Makeup of an XML Document – A Quick Primer
  • XML and Parsing XML Documents
  • Conclusion


    The Makeup of an XML Document – A Quick Primer - XML and Parsing XML Documents


    (Page 2 of 3 )

    An XML document is a tagged data file. The tags in an XML document define the structures and boundaries of the embedded data elements. The syntax of the tags is very similar to that of HTML. Parsing XML simply means retrieving data from an XML document based on its meaning and structure.

    Listed below is a sample XML document that contains a mail message:

    // mail.xml
    <?xml version="1.0"?>
    <!DOCTYPE mail SYSTEM "mail.dtd" [
    <!ENTITY from "from@from.com">
    <!ENTITY to "somebody@somewhere.com">
    <!ENTITY cc "you@you.com"> ]>
    <mail>
    <From> &from; </From>
    <To> &to; </To>
    <Cc> &cc; </Cc>
    <Date>Fri, 12 Jan 2001 10:21:56 -0600</Date>
    <Subject>XML and Parsing XML Documents </Subject>
    <Body language="english">
    An XML document is a tagged data file. The tags in an XML document define the structures and boundaries of the embedded data elements.
    <Signature>
    Zaid &from; http://www.devarticles.com </Signature>
    </Body>
    </mail>


    In general, there are four main components associated with an XML document: elements, attributes, entities, and DTDs.

    An element describes a piece of data. An element consists of markup tags and the element's content. The following is an element in the XML file listed above (mail.xml):

    <Subject> XML and Parsing XML Documents </Subject>

    It contains a start tag, <Subject>, the content XML and Parsing XML Documents, and an end tag, </Subject>.

    An attribute is used in an element to provide additional information about the element. It resides inside the start tag of the element. In the following example, language is an attribute of the element Body that describes the language used in the message body:

    <Body language="english">

    An entity is a named piece of data (either text data or binary data) that you can reference in an XML document. Entities are categorized into internal entities and external entities. An internal entity is defined inside an XML document and doesn't reference any outside content. For example, "from" is an internal entity defined in our XML file above:

    <!ENTITY from "from@from.com">

    The entity "from" is referenced later on in the XML document as &from;. When the XML document is parsed, the parser simply replaces the entity reference with its actual value: from@from.com.
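
    The replacement described above can be observed with a few lines of Java. This is a minimal sketch, assuming the JAXP DOM parser that ships with the JDK; the class name EntityDemo and the cut-down document are made up for illustration and are not part of the original article.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {
    // A cut-down mail.xml (assumption): only the internal entity "from" and one element.
    static final String MAIL_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE mail [\n"
        + "  <!ENTITY from \"from@from.com\">\n"
        + "]>\n"
        + "<mail><From> &from; </From></mail>";

    // Parses the document and returns the text of <From> after entity replacement.
    static String senderOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("From").item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The application never sees "&from;", only the replacement text.
        System.out.println(senderOf(MAIL_XML)); // prints from@from.com
    }
}
```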

    An external entity refers to content outside an XML document. Its content is usually a filename or a URL preceded by a SYSTEM or PUBLIC keyword. SYSTEM is followed by a URI, which may be a local filename or a URL. PUBLIC is followed by a public name that a parser can map to a known resource, and then by a system URI to fall back on. The following is an example of an external entity, iconimage, that references a local file called icon.png; the NDATA keyword marks it as an unparsed (non-XML) entity in the png notation:

    <!ENTITY iconimage SYSTEM "icon.png" NDATA png>

    A Document Type Definition (DTD) is an optional part of XML that defines the allowable structure for a particular XML document. Think of a DTD as the roadmap or rulebook of the XML document. The code listed below shows the DTD for the XML file (mail.xml) listed above:

    // mail.dtd
    <!ELEMENT mail (From, To, Cc, Date, Subject, Body)>
    <!ELEMENT From (#PCDATA)>
    <!ELEMENT To (#PCDATA)>
    <!ELEMENT Cc (#PCDATA)>
    <!ELEMENT Date (#PCDATA)>
    <!ELEMENT Subject (#PCDATA)>
    <!ELEMENT Signature (#PCDATA)>
    <!ELEMENT Body (#PCDATA|Signature)*>


    This DTD says that the element called mail contains six sub-elements, in this order: From, To, Cc, Date, Subject, and Body. The term #PCDATA stands for "parsed character data," which indicates that an element can contain only text. The last line of the DTD indicates that the element “Body” can contain mixed content: text, Signature sub-elements, or both.
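
    To see these rules enforced, a validating parser can be pointed at a document. The sketch below assumes the JDK's JAXP parser and, so that it needs no files on disk, places the element declarations in the document's internal subset instead of a separate mail.dtd. The mixed-content model for Body is written with *, which the XML grammar requires. The class name DtdValidationDemo and the two sample documents are hypothetical, and the language attribute is left off Body because the DTD as listed declares no attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationDemo {
    // The mail DTD rules, embedded as an internal subset (assumption: no external file).
    static final String DOCTYPE =
        "<!DOCTYPE mail [\n"
        + "<!ELEMENT mail (From, To, Cc, Date, Subject, Body)>\n"
        + "<!ELEMENT From (#PCDATA)>\n"
        + "<!ELEMENT To (#PCDATA)>\n"
        + "<!ELEMENT Cc (#PCDATA)>\n"
        + "<!ELEMENT Date (#PCDATA)>\n"
        + "<!ELEMENT Subject (#PCDATA)>\n"
        + "<!ELEMENT Signature (#PCDATA)>\n"
        + "<!ELEMENT Body (#PCDATA|Signature)*>\n"
        + "]>\n";

    // Follows the DTD: all six sub-elements, in the declared order.
    static final String VALID =
        "<mail><From>a@a.com</From><To>b@b.com</To><Cc>c@c.com</Cc>"
        + "<Date>Fri, 12 Jan 2001 10:21:56 -0600</Date><Subject>Hi</Subject>"
        + "<Body>Hello<Signature>Zaid</Signature></Body></mail>";

    // Breaks the DTD: the required Cc element is missing.
    static final String MISSING_CC =
        "<mail><From>a@a.com</From><To>b@b.com</To>"
        + "<Date>Fri, 12 Jan 2001 10:21:56 -0600</Date><Subject>Hi</Subject>"
        + "<Body>Hello</Body></mail>";

    // Returns the validation error messages for the given <mail> element.
    static List<String> validate(String mailElement) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // validity errors are recoverable
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + mailElement)));
        } catch (Exception e) {
            throw new IllegalStateException("document is not well-formed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("valid document errors: " + validate(VALID));
        System.out.println("missing Cc errors:     " + validate(MISSING_CC));
    }
}
```

    The first call should report no errors and the second at least one; the exact wording of the message depends on the parser implementation.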

    Event-Based XML Parser Versus Tree-Based XML Parser
    There are two types of interfaces available for parsing XML documents: the event-based interface and the tree-based interface.

    Event-Based XML Parsers
    An event-based XML parser reports parsing events directly to the application through callback methods. It provides a serial-access mechanism for accessing XML documents. Applications that use a parser's event-based interface need to implement the interface's event handlers to receive parsing events.

    The Simple API for XML (SAX) is an industry-standard event-based interface for XML parsing. The SAX 1.0 Java API defines several callback methods in one of its interfaces. Applications need to implement these callback methods to receive parsing events from the parser. For example, startElement() is one of these callback methods. When a SAX parser reaches the start tag of an element, it calls the application's implementation of startElement() and passes the tag name as one of the method's parameters.
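
    The callback style can be sketched in Java as follows. Note that this example uses the SAX 2 DefaultHandler class bundled with the JDK, not the SAX 1.0 interface mentioned above, so the startElement() signature has more parameters than the SAX 1.0 version. The class name SaxDemo and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxDemo {
    // Returns the tag names in the order the parser reports their start tags.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called by the parser each time it reaches a start tag.
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<mail><From>a@a.com</From><Subject>Hi</Subject>"
            + "<Body language=\"english\">Hello<Signature>Zaid</Signature></Body></mail>";
        System.out.println(elementNames(xml)); // prints [mail, From, Subject, Body, Signature]
    }
}
```

    The handler never holds the whole document; it only sees one event at a time, which is what makes the access serial.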

    Tree-Based XML Parsers
    A tree-based XML parser reads an entire XML document into an internal tree structure in memory. Each node of the tree represents a piece of data from the original document. This method allows an application to navigate and manipulate the parsed data quickly and easily.

    The Document Object Model (DOM) is an industry-standard tree-based interface for XML parsing. A DOM parser can be very memory and CPU intensive, because it keeps the whole data structure in memory. A DOM parser may cause performance issues for your wireless applications, especially when the XML document to be parsed is large and complex.
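
    For comparison, here is a minimal DOM sketch, again assuming the JAXP parser in the JDK. It loads a shortened version of the mail document into a tree, reads the language attribute of Body, and changes the Subject text in place. The class name DomDemo, its helper methods, and the "Re: " prefix are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomDemo {
    // Shortened mail document (assumption): just a Subject and a Body.
    static final String MAIL_XML =
        "<mail><Subject>XML and Parsing XML Documents</Subject>"
        + "<Body language=\"english\">Hello</Body></mail>";

    // Reads the whole document into an in-memory tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Navigates to <Body> and returns its language attribute.
    static String bodyLanguage(Document doc) {
        Element body = (Element) doc.getElementsByTagName("Body").item(0);
        return body.getAttribute("language");
    }

    // Modifies the tree: prepends a prefix to the <Subject> text and returns the result.
    static String prefixSubject(Document doc, String prefix) {
        Element subject = (Element) doc.getElementsByTagName("Subject").item(0);
        subject.setTextContent(prefix + subject.getTextContent());
        return subject.getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse(MAIL_XML);
        System.out.println(bodyLanguage(doc));          // prints english
        System.out.println(prefixSubject(doc, "Re: ")); // prints Re: XML and Parsing XML Documents
    }
}
```

    Because the tree stays in memory, nodes can be visited in any order and edited, which is the flexibility that costs the extra memory.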

    In general, SAX parsers are faster and consume less CPU and memory than DOM parsers. However, SAX parsers allow only serial access to the XML data. A DOM parser's tree-structured data is easier to access and manipulate. SAX parsers are often used by Java servlets or network-oriented programs to transmit and receive XML documents quickly and efficiently. DOM parsers are often used for manipulating XML documents that are stored persistently, such as a configuration file or an already saved order.
