Parsing XML with SAX and Python
By: Nadia Poulou
    2004-11-09

    Table of Contents:
  • Parsing XML with SAX and Python
  • The xml.sax Package
  • Our SAX Parser
  • The Heart of the Code
  • Element Content
  • The Main Code
  • Homework



    Parsing XML with SAX and Python - The xml.sax Package


    (Page 2 of 7)

    SAX is the Simple API for XML. The xml.sax package and its subpackages provide a Python implementation of the SAX interface.

    A SAX application consists of one or more input sources, a parser, and one or more handler objects. The idea is as follows: the parser reads bytes or characters from the input source and fires a sequence of events on the handler. In this document, as in the Python documentation, the term ‘reader’ is preferred over ‘parser’.

    The SAX API defines four basic interfaces. Since Python has no formal interface construct, these SAX interfaces are implemented in the xml.sax.handler module as the following Python classes:

    1. ContentHandler: this implements the main SAX interface for handling document events. It is also the interface that we will use in the example of the next section.

    2. DTDHandler: class for handling DTD events

    3. EntityResolver: class for resolving external entities

    4. ErrorHandler: as the name suggests, this class is used for reporting all errors and warnings.

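    I would also like to mention the DefaultHandler class from the xml.sax.saxutils module, which inherits from all four classes above. (The xml.sax.saxutils module in the Python 3 standard library does not provide this class; there, subclass the classes in xml.sax.handler directly.) Either way, an application has to implement only the handlers it actually uses, as the example in the next section will show.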
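
    As a minimal sketch of "implement only what you need", the handler below overrides a single ContentHandler method and inherits do-nothing defaults for everything else. The class name TitleCounter and the sample document are made up for illustration and are not part of the article's own example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TitleCounter(ContentHandler):
    """Hypothetical handler: counts <title> elements, ignores all other events."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # Only this event matters to us; the inherited defaults for
        # endElement(), characters(), etc. simply do nothing.
        if name == "title":
            self.count += 1


# Made-up sample document, passed as an in-memory byte stream.
SAMPLE = (
    b"<library>"
    b"<book><title>A</title></book>"
    b"<book><title>B</title></book>"
    b"</library>"
)

handler = TitleCounter()
xml.sax.parse(io.BytesIO(SAMPLE), handler)
print(handler.count)
```

    No DTDHandler, EntityResolver or custom ErrorHandler is written here, because this small task does not need them.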

    Basic Methods

    Now that we have looked at the interfaces, it’s time to see the basic functions of the xml.sax package. These are:

    make_parser() - This creates and returns a SAX XMLReader object. Note that the xml.sax readers are non-validating.

    parse(filename_or_stream, handler) - This creates a parser and parses the given document, which can be passed either as a filename or as a file object (stream). The handler must be a ContentHandler instance, the first of the classes listed above.
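
    The two functions can be compared side by side in a short sketch. It assumes an in-memory sample document and a hypothetical handler class named ElementNames; neither comes from the article. Both routes should deliver the same events to the handler.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import XMLReader

DOC = b"<a><b/><c/></a>"  # made-up sample document


class ElementNames(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# Route 1: create the reader yourself, attach the handler, then parse.
reader = xml.sax.make_parser()
explicit = ElementNames()
reader.setContentHandler(explicit)
reader.parse(io.BytesIO(DOC))

# Route 2: let xml.sax.parse() create the reader internally.
shortcut = ElementNames()
xml.sax.parse(io.BytesIO(DOC), shortcut)

print(isinstance(reader, XMLReader))
print(explicit.names == shortcut.names)
```

    The shortcut is convenient for one-off parsing, while make_parser() is the route to take when the reader itself needs to be configured or reused.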

    A reader and a handler are connected with the appropriate setter method (for example, setContentHandler() for a ContentHandler object). Once this is done, the reader reports parsing events by calling the methods of the handler. In the following example, the methods startElement(), endElement() and characters() of the ContentHandler illustrate this procedure.
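
    To make the event flow concrete, here is a small sketch that logs each callback as the reader fires it. The EventLogger class and the sample document are assumptions made for illustration. Note that a reader may deliver one run of text in several characters() calls, so the text is collected in pieces.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class EventLogger(ContentHandler):
    """Hypothetical handler that records every event it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # attrs is a mapping-like object; copy it into a plain dict.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Skip whitespace-only chunks to keep the log readable.
        if content.strip():
            self.events.append(("text", content))


reader = xml.sax.make_parser()
logger = EventLogger()
reader.setContentHandler(logger)  # connect reader and handler
reader.parse(io.BytesIO(b'<note lang="en"><to>Ann</to></note>'))

for event in logger.events:
    print(event)
```

    The log lists the start and end events in document order, with the element text reported between the start and end of the to element.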

    We will not go into error handling details in this article, but xml.sax defines a small set of exception classes for this purpose: SAXException and its subclasses, such as SAXParseException. The Python reference documentation has more details.

    Enough theory; let’s move on to a hands-on example.
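
    For orientation only, a minimal sketch of catching a parse error might look like this. The helper name try_parse and the malformed sample are made up; the sketch relies on the default error handler, which raises an exception on fatal errors.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


def try_parse(data):
    """Hypothetical helper: return "ok" or a short description of the parse error."""
    try:
        # A bare ContentHandler ignores all events; we only care about errors.
        xml.sax.parse(io.BytesIO(data), ContentHandler())
    except xml.sax.SAXParseException as exc:
        return "line %d: %s" % (exc.getLineNumber(), exc.getMessage())
    return "ok"


print(try_parse(b"<a><b/></a>"))   # well-formed
print(try_parse(b"<a><b></a>"))    # mismatched tags
```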

    © 2003-2008 by Developer Shed. All rights reserved. DS Cluster 6 hosted by Hostway