Parsing XML with SAX and Python - Element Content
The ‘points’ and ‘rebounds’ elements in our XML document are a little different: their values are not stored in attributes but in the element content. Element content is delivered to the characters() method, which is where our variables playerPoints and playerRebounds will be filled. This is why, the moment a ‘points’ or ‘rebounds’ element is found, we set the corresponding flag to 1 and clear the variable. Here is what our startElement() method looks like:
def startElement(self, name, attrs):
    if name == 'player':
        self.playerName = attrs.get('name', "")
        self.playerAge = attrs.get('age', "")
        self.playerHeight = attrs.get('height', "")
    elif name == 'points':
        self.isPointsElement = 1
        self.playerPoints = ""
    elif name == 'rebounds':
        self.isReboundsElement = 1
        self.playerRebounds = ""
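To see the difference between attribute values and element content, here is a minimal sketch that only records what attrs.get() returns for each start tag. The sample document, the AttributeProbe class and the probe() function are made up for this illustration and are not part of our application; the sketch is written for a current Python interpreter.

```python
import xml.sax

# Hypothetical sample document; the article's real file may differ.
SAMPLE = b"""<players>
  <player name="Ann Example" age="27">
    <points>25</points>
  </player>
</players>"""


class AttributeProbe(xml.sax.ContentHandler):
    """Records, for every start tag, two attribute lookups."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.seen = []

    def startElement(self, name, attrs):
        # attrs.get() falls back to the default when the attribute is absent.
        self.seen.append((name, attrs.get('name', ""), attrs.get('height', "")))


def probe(document):
    """Parse the document and return the recorded (tag, name, height) tuples."""
    handler = AttributeProbe()
    xml.sax.parseString(document, handler)
    return handler.seen
```

Calling probe(SAMPLE) yields a non-empty name only for the ‘player’ tag. The value 25 never appears in attrs, because it is element content and only reaches the handler through characters().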
In the endElement() method we finally compare our search term with the value of the ‘name’ attribute. If they match, we print our output; you can format this output any way you like. This is also the proper place to reset our flags, before the parser moves on to the next element.
Here is how our endElement() method looks:
def endElement(self, name):
    # Leaving an element: stop collecting its character data.
    if name == 'points':
        self.isPointsElement = 0
    if name == 'rebounds':
        self.isReboundsElement = 0
    # A player is complete; print it if it matches the search term.
    if name == 'player' and self.searchTerm == self.playerName:
        print '<h2>Statistics for player:', self.playerName, '</h2><br>(age:', self.playerAge, 'height:', self.playerHeight, ")<br>"
        print 'Match average:', self.playerPoints, 'points,', self.playerRebounds, 'rebounds'
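Why resetting the flags matters can be shown with a small sketch. It is an illustration under assumptions, not our application code: the one-line sample document, the PointsCollector class and its reset_flag switch are hypothetical, and the sketch targets a current Python interpreter.

```python
import xml.sax

# Hypothetical sample; the element names follow the article.
SAMPLE = b"<player><points>25</points><rebounds>10</rebounds></player>"


class PointsCollector(xml.sax.ContentHandler):
    """Collects the content of 'points', optionally without resetting the flag."""

    def __init__(self, reset_flag):
        xml.sax.ContentHandler.__init__(self)
        self.reset_flag = reset_flag  # illustration-only switch
        self.isPointsElement = 0
        self.playerPoints = ""

    def startElement(self, name, attrs):
        if name == 'points':
            self.isPointsElement = 1
            self.playerPoints = ""

    def endElement(self, name):
        # With reset_flag off, the flag stays at 1 after </points>.
        if name == 'points' and self.reset_flag:
            self.isPointsElement = 0

    def characters(self, ch):
        if self.isPointsElement == 1:
            self.playerPoints += ch


def collect_points(reset_flag):
    """Parse SAMPLE and return whatever ended up in playerPoints."""
    handler = PointsCollector(reset_flag)
    xml.sax.parseString(SAMPLE, handler)
    return handler.playerPoints
```

With the reset in place, collect_points(True) gives "25". Without it, collect_points(False) gives "2510", because the text of the following ‘rebounds’ element is appended to the points as well.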
The characters() method is invoked whenever a chunk of character data is found. This is where we use the flags set in our startElement() method: when a flag has the value 1, we append the data to the corresponding variable. Please note that the character data of an element is not necessarily delivered in a single call; the parser may split it into several chunks. That is why we append with += instead of simply assigning the value.
Here is how our characters() method looks:
def characters(self, ch):
    if self.isPointsElement == 1:
        self.playerPoints += ch
    if self.isReboundsElement == 1:
        self.playerRebounds += ch
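The chunking behaviour can be explored with the sketch below, which feeds a document to the parser in two pieces that cut through the text of a ‘points’ element. The ChunkRecorder class, the parse_in_pieces() helper and the tiny document are assumptions made for this illustration, written for a current Python interpreter. How many chunks the parser actually delivers depends on the parser and its version, so the sketch only relies on the accumulated result.

```python
import xml.sax


class ChunkRecorder(xml.sax.ContentHandler):
    """Accumulates 'points' content and remembers each chunk it received."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.isPointsElement = 0
        self.playerPoints = ""
        self.chunks = []

    def startElement(self, name, attrs):
        if name == 'points':
            self.isPointsElement = 1
            self.playerPoints = ""

    def endElement(self, name):
        if name == 'points':
            self.isPointsElement = 0

    def characters(self, ch):
        if self.isPointsElement == 1:
            self.chunks.append(ch)   # one entry per characters() call
            self.playerPoints += ch  # append, never overwrite


def parse_in_pieces(pieces):
    """Feed the byte strings in 'pieces' to an incremental SAX parser."""
    parser = xml.sax.make_parser()
    handler = ChunkRecorder()
    parser.setContentHandler(handler)
    for piece in pieces:
        parser.feed(piece)
    parser.close()
    return handler


# The split falls in the middle of the element content "25".
handler = parse_in_pieces([b"<points>2", b"5</points>"])
```

Whatever handler.chunks contains, joining its entries gives the same string as handler.playerPoints. A handler that assigned ch instead of appending it would keep only the last chunk whenever the data arrives in more than one call.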
So, that is the basic structure of our application! If you remember, this script will be called from a Web form, and our search term comes from the ‘playerName’ field of that form. The next part is the ‘main’ code, which does this job and uses the methods defined above.
More By Nadia Poulou