Parsing XML with SAX and Python - Our SAX Parser
Now, let’s put the theory of the previous section into practice. Imagine that you have the statistics of the players of a basketball team in an XML document named ‘playerStats.xml’. We will build a Python script that takes a player’s name as input and searches the document for that player’s statistics.
Let’s say that your XML document looks something like this:
As you can see, the player’s data is stored as attributes of the ‘player’ element, while the match averages for points and rebounds are stored as the contents of an element.
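As a minimal sketch of how a SAX handler could pick up both kinds of data, assume a layout like the one in the sample string below. The element and attribute names (`team`, `player`, `name`, `age`, `points`, `rebounds`), the values, and the `PlayerHandler` class are placeholders for illustration, not the contents of the actual playerStats.xml. Attributes arrive in `startElement`, while element contents arrive in `characters`. The sketch is written for Python 3.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical sample; the real playerStats.xml may use different names.
SAMPLE_XML = b"""<?xml version="1.0"?>
<team>
  <player name="Example Player" age="25">
    <points>10.5</points>
    <rebounds>4.2</rebounds>
  </player>
</team>
"""


class PlayerHandler(ContentHandler):
    """Collects attributes and text content for each player element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.players = []      # one dict per <player>
        self._current = None   # player being read, if any
        self._buffer = []      # text chunks of the current child element

    def startElement(self, name, attrs):
        if name == "player":
            # Player data is stored as attributes of the element.
            self._current = dict(attrs.items())
        self._buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks.
        self._buffer.append(content)

    def endElement(self, name):
        if name == "player":
            self.players.append(self._current)
            self._current = None
        elif self._current is not None:
            # Averages are stored as element contents.
            self._current[name] = "".join(self._buffer).strip()
        self._buffer = []


handler = PlayerHandler()
xml.sax.parseString(SAMPLE_XML, handler)
print(handler.players)
```

Buffering the text and joining it in `endElement` matters because a parser is free to split one run of text across several `characters` calls.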
Let’s make a Web form that lets the user select one of the players. After the user clicks the submit button, our script will parse the XML document and return the player’s data together with his average points and rebounds.
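On the script side, the first job is to read the name the form submitted. The sketch below assumes the form uses the GET method and a field called `player`; both the field name and the `requested_player` helper are hypothetical. It uses `urllib.parse.parse_qs` from Python 3 rather than the `cgi` module used in this article, because `cgi` was removed from the standard library in Python 3.13.

```python
from urllib.parse import parse_qs


def requested_player(query_string, field="player"):
    """Return the submitted player name, or None if the field is absent."""
    fields = parse_qs(query_string)
    values = fields.get(field)
    return values[0].strip() if values else None


# In a CGI script the query string would come from os.environ["QUERY_STRING"].
print(requested_player("player=Example+Player"))
```

Returning `None` for a missing field lets the caller decide whether to show an error or the empty form again.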
Now I will explain the steps necessary for our script to do the job. At the end of this section you will find the complete code.
First of all, in our code, we need to import all Python modules we will use:
from xml.sax import make_parser
from xml.sax.handler import ContentHandler
import cgi
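To show how the two SAX imports fit together, here is a minimal sketch written for Python 3. `make_parser` creates the parser object and a `ContentHandler` subclass receives its events. The `ElementCounter` class and the inline XML are made up for illustration; a real script would pass the file name ‘playerStats.xml’ to `parse()` instead.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Counts how often each element name occurs."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


parser = make_parser()               # create a SAX parser object
counter = ElementCounter()
parser.setContentHandler(counter)    # register the handler for parse events
# parse() accepts a file name or a file-like object.
parser.parse(io.BytesIO(b"<team><player/><player/></team>"))
print(counter.counts)
```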
Since we’re dealing with a CGI script, don’t forget to put the path to the Python executable on the first line of the script. On Unix systems it is usually:
#!/usr/bin/python
Please note that the above line may vary, depending on your system configuration. Contact your system administrator for more details in case of doubt.
Also, do not forget to send the content type before any other output, so that the browser knows how to interpret the page:
print "Content-Type: text/plain\n"
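The trailing "\n" in that line is significant: print adds one newline of its own, and the extra one produces the blank line that separates the headers from the body. As a small sketch of the same idea in Python 3, the hypothetical helper `build_response` below assembles the header, the blank line, and the body in one string.

```python
import sys


def build_response(body, content_type="text/plain"):
    """Return the header line, a blank line, then the body."""
    return "Content-Type: %s\n\n%s" % (content_type, body)


# Placeholder body text; the real script would write the player statistics.
sys.stdout.write(build_response("Player statistics go here\n"))
```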
In your script you can use the print statement to format the output in any way you want. In the example script at the end of this page you will see that I used some basic formatting. Of course, if you want your page to look really good, go ahead and change what is provided here, but keep in mind that for production applications it’s usually better to separate your logic from your HTML code.
In production applications you may also want to structure your logic into several packages and classes instead of doing everything in one script, as shown here. Since this is not a software engineering course, I will not go into details but return directly to our simple script.