In this article, you'll create a graphical report from Google News RSS data, using a handy utility called FeedTools and a plug-in called CSS Graphs Helper. This article is excerpted from chapter 11 of the book Practical Reporting with Ruby and Rails, written by David Berube (Apress; ISBN: 1590599330).
Thanks to the Internet, we have easy access to all types of information. Even local newspapers that were formerly inaccessible outside their locality are publishing stories online. As a result, you can catch up on localized news from all over the world. This is a significant step forward, of course, since it means you can get news about, say, Brazil, straight from the source, instead of from an Associated Press reporter who may have been in the country for only a few days. The downside is that there's an explosion of news sites: some good, some bad, and some mediocre. There are so many of them that they can be hard to sift through, which makes it difficult to extract the particular information you want from the mass of information you can access.
Fortunately, tools are available to help with the task of news organization. For example, just as Google web search makes searching for web sites easier, Google News makes searching news easier. Google News aggregates news from all over the world and lets you filter by useful constraints, such as keywords and dates. In fact, Google News can even eliminate duplicate stories (a result of news syndication companies such as the Associated Press and United Press International selling stories to dozens or hundreds of newspapers).
You can use Google News to track news topics easily and quickly. One way is to find news manually via the web interface at http://news.google.com/. Another approach is to use the Google News Really Simple Syndication (RSS) interface. Used in conjunction with a programming language such as Ruby, this interface allows you to manage news aggregation in ways limited only by the boundaries of your imagination.
In this chapter, you'll create a graphical report from Google News RSS data. To parse the data for this example, you'll use a handy utility called FeedTools, which we'll look at first. To create the graphs, you'll use a plug-in called CSS Graphs Helper. This is an easy-to-use tool for creating simple HTML charts, as you'll see when you create the Rails application later in the chapter.
Using FeedTools to Parse RSS
Google News provides its data in RSS form, which is an XML format, so you could parse it using a Ruby library like the standard REXML or the XmlSimple or Remarkably gems (both introduced in Chapter 9). However, FeedTools gives you the advantage of a powerful interface specific to news feeds, which makes your life much easier.
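To make the comparison concrete, here is a minimal sketch of the by-hand approach using REXML. The sample feed, the SAMPLE_RSS constant, and the item_titles helper are all invented for illustration; a real script would fetch the feed over HTTP rather than embed it.

```ruby
require 'rexml/document'

# Hypothetical sample feed, invented for illustration only.
SAMPLE_RSS = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example News</title>
      <link>http://example.com/</link>
      <description>Sample feed</description>
      <item>
        <title>First story</title>
        <link>http://example.com/stories/1</link>
      </item>
      <item>
        <title>Second story</title>
        <link>http://example.com/stories/2</link>
      </item>
    </channel>
  </rss>
XML

# Return the text of every <title> element nested inside an <item>.
# The channel's own <title> is skipped because it is not inside an <item>.
def item_titles(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//item/title').map(&:text)
end

puts item_titles(SAMPLE_RSS)
```

Even in this small case, the code has to know the element names and nesting of the RSS format. A feed-specific library hides those details behind methods such as items and title.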
For example, here's how easy it is to print out the titles from the RubyForge news feed, which lists all the new software released on RubyForge:
--------------------------------------------
Net::NNTP Client Library:SCM is now Subversion
rb-appscript 0.5.0 released
Open Ruby on Rails Book:openrorbook Download Issues
Duration 0.1.0 released
votigoto 0.2.1 Released
Sequel 0.4.4.2 Released
--------------------------------------------
The second line creates a new FeedTools::Feed object using the open method. The URL specified is http://rubyforge.org/export/rss_sfnews.php, which is the RSS feed for RubyForge. The next line calls the feed's items method and then each to iterate through the feed items, using the title method of each item to print its title. You can access other attributes of each item, such as the URL of the full view of the item, the date it was updated, and so forth. If it's included in the RSS feed, the full text of an item is available through the description method.
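The same per-item access pattern can be tried without a network connection. The following is a minimal sketch that uses Ruby's rss library as a stand-in, not FeedTools itself, so the method names shown belong to that library and may differ from the FeedTools ones. The sample feed and its contents are invented for illustration.

```ruby
require 'rss'

# Hypothetical sample feed, invented for illustration only.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example Releases</title>
      <link>http://example.com/</link>
      <description>Sample release announcements</description>
      <item>
        <title>widget 1.0 released</title>
        <link>http://example.com/news/1</link>
        <description>First public release of widget.</description>
        <pubDate>Mon, 07 Jan 2008 10:00:00 GMT</pubDate>
      </item>
    </channel>
  </rss>
XML

# Parse the document; the second argument turns off strict validation.
feed = RSS::Parser.parse(SAMPLE_FEED, false)

# Build one summary string per item from its title, link, date, and text.
summaries = feed.items.map do |item|
  "#{item.title} (#{item.link}, #{item.pubDate.year}): #{item.description}"
end

puts summaries
```

As with the FeedTools listing, each item exposes its title, link, publication date, and description as plain method calls, so the reporting code never touches the underlying XML.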
FeedTools can also parse Atom and Channel Definition Format (CDF) feeds, as well as generate news feeds in RSS, Atom, or CDF form. You can find out more about FeedTools at its home page: http://sporkmonger.com/projects/feedtools/.