When offering content on a website, many webmasters find that they must consider the context of that content. Will it be offered as HTML? A PDF document? Or do you want to let your visitors choose? If you choose the last option, how do you avoid having to redo the entire document by hand in each format? XML lets you generate context-specific representations of rich content sources through both modular construction and data transformation. This article is taken from chapter one of XML Publishing with AxKit by Kip Hampton (O'Reilly, 2004; ISBN 0596002165).
XML as a Publishing Technology - Dynamic Server-Side Transformations
In the server-side runtime processing model, all XML data is parsed and then transformed on the server machine before it is delivered to the client. Typically, when a request is received, the web server calls out via a server extension interface to an external XML parser and stylesheet processor that performs any necessary transformations on the data before handing it back to the web server to deliver to the client. The client application is expected only to be able to render the delivered data, as shown in Figure 1-4.
Figure 1-4. The server-side processing model
Handling all processing dynamically on the server offers several benefits. It is a given that a scripting engine or other application framework will be called on to process the XML data. As a result, the same methods that can be used from within that framework to capture information about a given request (HTTP cookies, URL parameters, POSTed form data, etc.) can be used to determine which transformations occur and on which documents. In the same way, access to the User-Agent and Accept headers gives the developer the opportunity to detect the type of client making the connection and to transform the data into the appropriate format for that device. This ability to transform documents differently, based on context, gives the dynamic server-side processing model a level of flexibility that is simply impossible to achieve when using the client-side or preprocessed approaches.
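As a rough illustration of that idea, here is a minimal Python sketch of choosing an output format from request context. The function name, the `format` parameter, and the set of formats are all hypothetical and are not part of AxKit; the header matching is deliberately naive (plain substring checks, no quality values).

```python
def choose_output_format(headers, params):
    """Pick an output format from request context (all names are hypothetical)."""
    # An explicit URL parameter wins over anything inferred from headers.
    requested = params.get("format")
    if requested in {"html", "pdf", "wml"}:
        return requested

    accept = headers.get("Accept", "").lower()
    user_agent = headers.get("User-Agent", "").lower()

    # A client that accepts PDF but not HTML gets PDF.
    if "application/pdf" in accept and "text/html" not in accept:
        return "pdf"
    # A WAP device is detected from either header.
    if "text/vnd.wap.wml" in accept or "wap" in user_agent:
        return "wml"
    # Everything else falls back to HTML.
    return "html"
```

The same source document could then be handed to whichever transformation matches the returned format.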
Server-side XML processing also has its downside. Calling out to a scripting engine, which calls external libraries to process the XML, adds overhead to serving documents. A single transformation from Simplified DocBook to HTML may not require a lot of processing power. However, if that transformation is being performed for each request, then performance may become an issue for high-traffic sites. Depending on the XML interface used, the in-memory representation of a given document can be as much as 10 times larger than its file size on disk, so parsing large XML documents or using complex stylesheets to transform data can cause a heavy performance hit. In addition, choosing to keep the XML processing on the server may also limit the number of possible hosting options for a given project. Most service providers do not currently offer XML processing facilities as part of their basic hosting packages, so developers must seek a specialty provider or co-locate a server machine if they do not already host their own web servers.
Comparing these three approaches to publishing XML content, you can generally say that dynamic server-side processing offers the greatest flexibility and extensibility for the least risk and effort. The cost of server-side processing lies largely in finding a server that provides the necessary functionality—a far more manageable cost, usually, than that of working around client-side implementations beyond your control or writing custom offline processing tools.
Introducing AxKit, an XML Application Server for Apache
Originally conceived in 2000 by Matt Sergeant as a Perl-powered alternative to the then Java-centric world of XML application servers, AxKit (short for Apache XML Toolkit) uses the mod_perl extension to the Apache HTTP server to turn Apache into an XML publishing and application server. AxKit extends Apache by offering a rich set of server configuration directives designed to simplify and automate common tasks associated with publishing XML content, selecting and applying transformative processes to XML content to deliver the most appropriate result.
Using AxKit’s custom directives, content transformations (including chains of transformations) can be applied based on a variety of conditions (request URI, aspects of the XML content, and much more) on a resource-by-resource basis. Among other things, this provides the ability to set up multiple, alternate styles for a given resource and then select the most appropriate one at runtime. Also, by default, the result of each processing chain is cached to disk on the first request. Unless the source XML or the stylesheets in the chain change, all subsequent requests are served from the cache. Figure 1-5 illustrates the processing flow for a resource with one associated processing chain consisting of two transformations.
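The caching rule described here can be sketched in a few lines of Python. This is only an illustration of the general idea, not AxKit's implementation: it assumes that "changed" can be judged by comparing file modification times, and the function names and the `run_chain` callback are hypothetical.

```python
import os


def cache_is_fresh(cache_path, source_path, stylesheet_paths):
    """Return True if a cached result exists and no input is newer than it."""
    if not os.path.exists(cache_path):
        return False
    cached_at = os.path.getmtime(cache_path)
    inputs = [source_path, *stylesheet_paths]
    return all(os.path.getmtime(path) <= cached_at for path in inputs)


def serve(cache_path, source_path, stylesheet_paths, run_chain):
    """Serve from the cache when fresh; otherwise run the chain and store the result."""
    if cache_is_fresh(cache_path, source_path, stylesheet_paths):
        with open(cache_path, "rb") as fh:
            return fh.read()
    # run_chain is a caller-supplied function that returns the output as bytes.
    result = run_chain(source_path, stylesheet_paths)
    with open(cache_path, "wb") as fh:
        fh.write(result)
    return result
```

With this arrangement the cost of the transformation is paid once per change to the source or stylesheets rather than once per request.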
Figure 1-5. Basic two-stage processing chain
In its design, AxKit implements a modular system that divides the low-level tasks required for serving XML data across a series of swappable component classes. For example, Provider classes are responsible for fetching the sources for the content and stylesheets associated with the current request, while Language modules implement interfaces to the various transformative processors. (You can find details of each type of component class in Chapter 8.) This modular design makes AxKit quite extensible and able to cope with heterogeneous publishing strategies. Suppose that some content you are serving is stored in a relational database. You need only swap in a Provider class that selects the appropriate data for those pages from the database, while still using the default filesystem-based Provider for static documents stored on the disk. Several alternative components of various classes ship with the core AxKit distribution, and many others are available via the Comprehensive Perl Archive Network. Often, little or no custom code needs to be written. You simply drop in the appropriate component and configure its options.
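To make the swappable-component idea concrete, here is a small Python sketch of two interchangeable providers. AxKit's real Provider classes are Perl modules with their own interface; the class names, the `fetch` method, and the `documents` table used here are assumptions made only for illustration.

```python
import os
import sqlite3
from abc import ABC, abstractmethod


class Provider(ABC):
    """Hypothetical interface: fetch the source text for a resource key."""

    @abstractmethod
    def fetch(self, key: str) -> str:
        ...


class FileProvider(Provider):
    """Reads documents from a directory on disk."""

    def __init__(self, root: str) -> None:
        self.root = root

    def fetch(self, key: str) -> str:
        with open(os.path.join(self.root, key), encoding="utf-8") as fh:
            return fh.read()


class DatabaseProvider(Provider):
    """Reads documents from a table named `documents` (an assumed schema)."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection

    def fetch(self, key: str) -> str:
        row = self.connection.execute(
            "SELECT body FROM documents WHERE name = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]


def render(provider: Provider, key: str) -> str:
    """Downstream code depends only on the Provider interface."""
    return provider.fetch(key).strip()
```

Because `render` only calls `fetch`, either provider can be dropped in without touching the code that consumes the content.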
We will look at each AxKit option for creating style processing chains in depth in Chapter 4. But for now, recall the collection of poems that you marked up using the poemsfrag Document Type Definition earlier in this chapter. Also, remember that when you left off, you were a bit stuck: the poems’ markup captured the content in a semantically meaningful way, but by abandoning HTML as the source grammar, you lost the ability to just upload the document to a web server and expect that browsers would render it properly. This is precisely the type of task that AxKit was designed to address. Figure 1-6 illustrates a single source document containing a poem and three alternative processing chains implemented as named styles that can be selected at runtime to render that poem in various formats.
Figure 1-6. Alternate style chains
Here is a sample configuration snippet that would implement these styles, making each selectable by adding a style parameter with the appropriate value to the request’s query string:
<Directory /poems>
  <Files *.xml>
    # choose styles based on the query string
    AxAddPlugin Apache::AxKit::StyleChooser::QueryString

    # renders the poem as HTML
    <AxStyleName poem_html>
      AxAddProcessor text/xsl /styles/poem2html.xsl
    </AxStyleName>

    # generates the poem as PDF
    <AxStyleName poem_pdf>
      AxAddProcessor text/xsl /styles/poem2fo.xsl
      AxAddProcessor application/x-xsl-fo NULL
    </AxStyleName>

    # extracts the metadata from the poem and renders it as RDF
    <AxStyleName poem_rdf>
      AxAddProcessor text/xsl /styles/poem2rdf.xsl
    </AxStyleName>

    # set a default style if none is passed explicitly
    AxStyle poem_html
  </Files>
</Directory>
With this in place, you can put your XML documents that use the poemsfrag grammar into the poems directory and render each poem in one of three formats. For example, a request to http://that.host/poems/mypoem.xml?style=poem_pdf returns the selected poem as a PDF document. A request for the same poem with style=poem_rdf in the query string offers the metadata about the selected poem as an RDF document. In each case, the source document does not change. Only the styles applied to its contents differ.
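The selection logic behind such requests can be modeled roughly as follows. This Python sketch is not how AxKit is implemented; it only mirrors the configuration above as a lookup table. The names `STYLES`, `DEFAULT_STYLE`, and `select_style` are hypothetical, and falling back to the default for an unrecognized style name is an assumption of the sketch, not documented AxKit behavior.

```python
from urllib.parse import parse_qs, urlsplit

# Each named style maps to an ordered chain of (processor type, stylesheet) pairs,
# mirroring the AxAddProcessor lines in the configuration snippet.
STYLES = {
    "poem_html": [("text/xsl", "/styles/poem2html.xsl")],
    "poem_pdf": [
        ("text/xsl", "/styles/poem2fo.xsl"),
        ("application/x-xsl-fo", "NULL"),
    ],
    "poem_rdf": [("text/xsl", "/styles/poem2rdf.xsl")],
}
DEFAULT_STYLE = "poem_html"


def select_style(url):
    """Return (style name, processing chain) for a request URL."""
    query = parse_qs(urlsplit(url).query)
    name = query.get("style", [DEFAULT_STYLE])[0]
    # Assumption: unknown style names fall back to the default style.
    if name not in STYLES:
        name = DEFAULT_STYLE
    return name, STYLES[name]
```

For example, `select_style("http://that.host/poems/mypoem.xml?style=poem_pdf")` selects the two-step `poem_pdf` chain, while the same URL without a query string selects `poem_html`.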
Finally, it is worth noting here that AxKit is an officially sanctioned Apache Software Foundation (ASF) project. This means that AxKit is not an experimental hobby-ware project. Rather, it is a battle-tested framework developed and maintained by a community of committed professional developers who need to solve real-world problems. No project of any size is entirely bug-free, but AxKit’s role as an ASF-blessed project means, at the very least, that it is held to a high standard of excellence. If something does go wrong, its users can fully expect an active community to be around to address the problem, both now and in the future.