Parsing XML with SAX and Python
By: Nadia Poulou
    2004-11-09

    Table of Contents:
  • Parsing XML with SAX and Python
  • The xml.sax Package
  • Our SAX Parser
  • The Heart of the Code
  • Element Content
  • The Main Code
  • Homework



    Parsing XML with SAX and Python - Element Content


    (Page 5 of 7 )

    The ‘points’ and ‘rebounds’ elements in our XML document are a little different: their values are not stored in element attributes but in the element content. Parsing that content is the job of the characters() method, which loads our variables playerPoints and playerRebounds. This is why, as soon as a ‘points’ or ‘rebounds’ element is found, we set the corresponding flag to 1 and clear the variable that will collect its text. Here is what our startElement() method looks like:

    def startElement(self, name, attrs):
        if name == 'player':
            # Attribute values; a missing attribute falls back to "".
            self.playerName = attrs.get('name', "")
            self.playerAge = attrs.get('age', "")
            self.playerHeight = attrs.get('height', "")
        elif name == 'points':
            # Raise the flag and start with an empty buffer for characters().
            self.isPointsElement = 1
            self.playerPoints = ""
        elif name == 'rebounds':
            self.isReboundsElement = 1
            self.playerRebounds = ""
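The attrs object passed to startElement() can be queried much like a dictionary, which is why get() with a default of "" is safe even when an attribute is absent. A minimal sketch, assuming Python 3 and building the attributes object by hand with xml.sax.xmlreader.AttributesImpl; the sample values are made up for illustration:

```python
from xml.sax.xmlreader import AttributesImpl

# Hypothetical attributes of a <player> element that lacks 'height'.
attrs = AttributesImpl({'name': 'Alice', 'age': '27'})

name = attrs.get('name', "")      # attribute present: its value is returned
height = attrs.get('height', "")  # attribute missing: the default "" is returned

print(repr(name), repr(height))
```

In a real parse the SAX driver creates this object for you; constructing it directly is only a convenient way to try the lookup in isolation.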

    In the endElement() method we finally compare our search term with the value of the ‘name’ attribute. If they match, we print our output. You can format this output any way you like. This is also the proper place to reset our flags, before the parser moves on to the next element.

    Here is how our endElement() method looks:

    def endElement(self, name):
        # Reset the flags so that characters() stops collecting text.
        if name == 'points':
            self.isPointsElement = 0
        if name == 'rebounds':
            self.isReboundsElement = 0
        # The player element is complete: print it if it matches the search term.
        if name == 'player' and self.searchTerm == self.playerName:
            print('<h2>Statistics for player: %s</h2><br>(age: %s, height: %s)<br>'
                  % (self.playerName, self.playerAge, self.playerHeight))
            print('Match average: %s points, %s rebounds'
                  % (self.playerPoints, self.playerRebounds))
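To see why the flags have to be reset in endElement(), the following sketch may help. It assumes Python 3 and uses a cut-down, hypothetical handler (PointsHandler, with a reset_flag switch that is not part of the article's code) on a made-up document. Without the reset, the whitespace that follows the closing tag is also appended to playerPoints.

```python
import xml.sax

class PointsHandler(xml.sax.ContentHandler):
    """Collects the text of <points>; reset_flag toggles the endElement() reset."""

    def __init__(self, reset_flag):
        super().__init__()
        self.reset_flag = reset_flag
        self.isPointsElement = 0
        self.playerPoints = ""

    def startElement(self, name, attrs):
        if name == 'points':
            self.isPointsElement = 1
            self.playerPoints = ""

    def endElement(self, name):
        if name == 'points' and self.reset_flag:
            self.isPointsElement = 0

    def characters(self, ch):
        if self.isPointsElement == 1:
            self.playerPoints += ch

# Made-up sample document; note the newline after </points>.
SAMPLE = b'<player name="Alice"><points>25.5</points>\n</player>'

def collect_points(reset_flag):
    handler = PointsHandler(reset_flag)
    xml.sax.parseString(SAMPLE, handler)
    return handler.playerPoints

with_reset = collect_points(True)
without_reset = collect_points(False)
print(repr(with_reset), repr(without_reset))
```

With the reset in place only the element's own text is collected; without it, the trailing newline ends up in the value as well.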

    The characters() method is invoked whenever a chunk of character data is found. This is where we use the flags set in our startElement() method: when a flag has the value 1, we append the data to the corresponding variable. Please note that the character data of an element is not necessarily returned in a single call. The parser may split it into several chunks, which is why we append with += instead of assigning.

    Here is how our characters() method looks:

    def characters(self, ch):
        # Append, because the text of one element may arrive in several calls.
        if self.isPointsElement == 1:
            self.playerPoints += ch
        if self.isReboundsElement == 1:
            self.playerRebounds += ch
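Putting the three callbacks together, here is a rough, self-contained sketch of how they cooperate. It assumes Python 3. The class name PlayerHandler, the matches list (used here instead of printing), the players root element and the sample data are all assumptions made for illustration, not the article's actual script. The first part calls the callbacks by hand to simulate the parser delivering one value in two chunks; the second part runs a real parse.

```python
import xml.sax

class PlayerHandler(xml.sax.ContentHandler):
    def __init__(self, searchTerm):
        super().__init__()
        self.searchTerm = searchTerm
        self.isPointsElement = 0
        self.isReboundsElement = 0
        self.playerName = ""
        self.playerPoints = ""
        self.playerRebounds = ""
        self.matches = []  # hypothetical: collect results instead of printing

    def startElement(self, name, attrs):
        if name == 'player':
            self.playerName = attrs.get('name', "")
        elif name == 'points':
            self.isPointsElement = 1
            self.playerPoints = ""
        elif name == 'rebounds':
            self.isReboundsElement = 1
            self.playerRebounds = ""

    def endElement(self, name):
        if name == 'points':
            self.isPointsElement = 0
        elif name == 'rebounds':
            self.isReboundsElement = 0
        elif name == 'player' and self.searchTerm == self.playerName:
            self.matches.append(
                (self.playerName, self.playerPoints, self.playerRebounds))

    def characters(self, ch):
        if self.isPointsElement == 1:
            self.playerPoints += ch
        if self.isReboundsElement == 1:
            self.playerRebounds += ch

# 1) Simulate the text "25.5" arriving in two separate characters() calls.
chunked = PlayerHandler('Alice')
chunked.startElement('points', {})
chunked.characters('25')
chunked.characters('.5')
chunked.endElement('points')

# 2) Parse a made-up document and look for one player.
SAMPLE = b"""<players>
  <player name="Alice" age="27" height="1.98"><points>25.5</points><rebounds>7.1</rebounds></player>
  <player name="Bob" age="31" height="2.05"><points>11.0</points><rebounds>9.4</rebounds></player>
</players>"""
parsed = PlayerHandler('Bob')
xml.sax.parseString(SAMPLE, parsed)

print(chunked.playerPoints, parsed.matches)
```

Because characters() appends rather than assigns, the collected value is the same however the parser happens to split the text.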

    So much for the basic structure of our application! If you remember, this script will be called from a Web form, and our search term is in the ‘playerName’ field of this form. The following part is the ‘main’ code that does this job and uses the methods defined earlier.
