
Parsing XML with SAX and Python


In this article Nadia explains how to parse an XML document using the SAX API implementation available for Python.

Author Info:
By: Nadia Poulou
November 09, 2004
TABLE OF CONTENTS:
  1. · Parsing XML with SAX and Python
  2. · The xml.sax Package
  3. · Our SAX Parser
  4. · The Heart of the Code
  5. · Element Content
  6. · The Main Code
  7. · Homework


Parsing XML with SAX and Python - The Main Code
(Page 6 of 7)

 

The following snippet gets the player name from the playerName field of the form:

FormData = cgi.FieldStorage()
searchTerm = FormData["playerName"].value

Now that we have our search term, let's create our parser and handler objects:

parser = make_parser()   
curHandler = BasketBallHandler(searchTerm)

The setContentHandler() method connects our ContentHandler implementation to the parser (reader) instance, as shown here:

parser.setContentHandler(curHandler)

Finally we parse our XML document:

parser.parse(open('playerStats.xml'))
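The same four steps (create the parser, create the handler, connect them, parse) can be tried without a web server. The following is only a sketch under stated assumptions: the XML layout in SAMPLE_XML is guessed from the attribute and element names the handler reads, so the real playerStats.xml may differ, and PlayerLookup and find_player are hypothetical names that return the match instead of printing it.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

# Assumed layout, inferred from the names the article's handler looks for.
SAMPLE_XML = b"""<players>
  <player name="Ann" age="27" height="1.85">
    <points>12.5</points>
    <rebounds>4.1</rebounds>
  </player>
  <player name="Bea" age="31" height="1.92">
    <points>8.0</points>
    <rebounds>9.3</rebounds>
  </player>
</players>"""


class PlayerLookup(ContentHandler):
    """Hypothetical handler: remembers the player whose name matches."""

    def __init__(self, search_term):
        ContentHandler.__init__(self)
        self.search_term = search_term
        self.result = None      # filled in when a matching player ends
        self._current = None    # data of the player being read
        self._field = None      # 'points' or 'rebounds' while inside one

    def startElement(self, name, attrs):
        if name == "player":
            self._current = {
                "name": attrs.get("name", ""),
                "age": attrs.get("age", ""),
                "height": attrs.get("height", ""),
                "points": "",
                "rebounds": "",
            }
        elif name in ("points", "rebounds") and self._current is not None:
            self._field = name

    def characters(self, content):
        # Append, because the text of one element may arrive in pieces.
        if self._field is not None:
            self._current[self._field] += content

    def endElement(self, name):
        if name in ("points", "rebounds"):
            self._field = None
        elif name == "player":
            if self._current["name"] == self.search_term:
                self.result = self._current
            self._current = None


def find_player(xml_bytes, search_term):
    parser = make_parser()                  # 1. create the parser
    handler = PlayerLookup(search_term)     # 2. create the handler
    parser.setContentHandler(handler)       # 3. connect them
    parser.parse(io.BytesIO(xml_bytes))     # 4. parse the document
    return handler.result
```

Calling find_player(SAMPLE_XML, "Bea") returns a dictionary with that player's attributes and averages, while a name that does not appear in the document returns None.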

For reference, here is the complete script:

#!/usr/bin/python

# The script emits HTML, so the header must announce text/html.
print("Content-Type: text/html\n")
print("<html><body>")

from xml.sax import make_parser
from xml.sax.handler import ContentHandler
import cgi

class BasketBallHandler(ContentHandler):

 def __init__(self, searchTerm):
   self.searchTerm = searchTerm
   self.isPointsElement, self.isReboundsElement = 0, 0
   
 def startElement(self, name, attrs):

   if name == 'player':     
     self.playerName = attrs.get('name',"")
     self.playerAge = attrs.get('age',"")
     self.playerHeight = attrs.get('height',"")
   elif name == 'points':
     self.isPointsElement = 1
     self.playerPoints = ""
   elif name == 'rebounds':
     self.isReboundsElement = 1
     self.playerRebounds = ""
   return

 def characters(self, ch):
   if self.isPointsElement == 1:
     self.playerPoints += ch
   if self.isReboundsElement == 1:
     self.playerRebounds += ch

 def endElement(self, name):
   if name == 'points':
     self.isPointsElement = 0
   if name == 'rebounds':
     self.isReboundsElement = 0
   if name == 'player' and self.searchTerm == self.playerName:
     # Print the statistics only for the player that was searched for.
     print('<h2>Statistics for player: %s</h2><br>(age: %s, height: %s)<br>'
           % (self.playerName, self.playerAge, self.playerHeight))
     print('Match average: %s points, %s rebounds'
           % (self.playerPoints, self.playerRebounds))

FormData = cgi.FieldStorage()
searchTerm = FormData["playerName"].value
parser = make_parser()
curHandler = BasketBallHandler(searchTerm)
parser.setContentHandler(curHandler)
parser.parse(open('playerStats.xml'))
print("</body></html>")
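
Note that characters() in the script appends to playerPoints and playerRebounds with += rather than assigning. A SAX parser is allowed to deliver the text of a single element in several calls, so assigning could keep only the last piece. The sketch below illustrates this by feeding a document to the parser in two parts. It assumes that make_parser() returns the default expat-based reader, which supports incremental feed() and close(); TextCollector and collect_text are hypothetical names.

```python
from xml.sax import make_parser
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler: stores every piece of text it is handed."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def collect_text(pieces):
    # Assumes the default reader is incremental (true for the expat reader).
    parser = make_parser()
    handler = TextCollector()
    parser.setContentHandler(handler)
    for piece in pieces:
        parser.feed(piece)   # hand the document over one piece at a time
    parser.close()           # signal the end of the document
    return handler.chunks


# The element text "12.5" is split across the two pieces fed to the parser.
chunks = collect_text([b"<points>12", b".5</points>"])
full_text = "".join(chunks)
```

However many calls the parser makes, joining the chunks gives back the complete value, which is exactly what the += in the handler achieves.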

