
Crawling the Semantic Web, concluded

This article, the second of two parts, examines the problems raised by the glut of information available through the web, and how to tame it. It is excerpted from the book Wicked Cool Java, written by Brian D. Eubanks (No Starch Press, 2005; ISBN: 1593270615).

Author Info:
By: No Starch Press
March 02, 2006
  1. Crawling the Semantic Web, concluded
  2. Guess What? Publishing RSS Newsfeeds with Informa
  3. What's Up? Aggregating RSS Newsfeeds
  4. Heading to the Polls: Polling RSS Feeds with Informa
  5. All the News Fit to Print: Filtering RSS Feeds with Informa


Crawling the Semantic Web, concluded - What's Up? Aggregating RSS Newsfeeds
(Page 3 of 5)

In the previous section, we used the Informa library to create RSS content, so that visitors with content aggregators can be automatically informed about updates to your site. Another great use of RSS within your site is displaying recent news related to your industry. You can get these newsfeeds from many sources, such as news sites, websites in your industry, and aggregator sites like Syndic8. Make sure to check whether the sites you are syndicating will allow you to incorporate items from their feeds into your site. Usually this is the case, but not always.

Let's start by reading items from a newsfeed and displaying them as text. Using Informa, reading an RSS feed is easy. You pass the parser the same kind of ChannelBuilder object that we used in the previous section, and it populates a channel with data from an existing RSS feed. The FeedParser class has a static parse method that takes the builder and the feed's URL and returns a ChannelIF instance containing the channel data from the RSS feed. The RSS standards may be in a state of confusion, but the Informa API reads all of them and gives us a common object model for working with them.

import de.nava.informa.impl.basic.Channel;
import de.nava.informa.impl.basic.ChannelBuilder;
import de.nava.informa.impl.basic.Item;
import de.nava.informa.parsers.FeedParser;

// parse throws checked exceptions, so call it where they are handled or declared
ChannelBuilder builder = new ChannelBuilder();
String url = "http://wickedcooljava.com/updates.rss";
Channel channel = (Channel) FeedParser.parse(builder, url);
System.out.println("Description: " + channel.getDescription());
System.out.println("Title: " + channel.getTitle());
// using Java 5 syntax in this for loop
for (Object x : channel.getItems()) {
  Item anItem = (Item) x;
  System.out.print(anItem.getTitle() + " - ");
  // finish the line started above with the item's description
  System.out.println(anItem.getDescription());
}

This will print some basic information about the channel and its items. If you want to include these in a web page, it's now just a matter of wrapping HTML tags around the text. If you are including RSS files that are outside your control, you may want to filter data from the channels before displaying them. We'll discuss this in a later section.
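
As a rough illustration of wrapping HTML tags around the feed text, here is a minimal sketch in plain Java. It does not use Informa at all: FeedHtmlRenderer, Entry, escape, and render are hypothetical names made up for this example, and Entry simply stands in for the title and description strings you would read from each item. The sketch assumes the feed text should be shown literally, so it escapes HTML-significant characters before inserting them into the page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Informa: renders feed text as an HTML list.
final class FeedHtmlRenderer {

    /** Plain holder for the two strings we display; stands in for a feed item. */
    static final class Entry {
        final String title;
        final String description;

        Entry(String title, String description) {
            this.title = title;
            this.description = description;
        }
    }

    /** Escapes the characters that are significant in HTML text; null becomes "". */
    static String escape(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds a heading plus a <ul> with one <li> per entry: bold title, then description. */
    static String render(String channelTitle, List<Entry> entries) {
        StringBuilder html = new StringBuilder();
        html.append("<h2>").append(escape(channelTitle)).append("</h2>\n");
        html.append("<ul>\n");
        for (Entry e : entries) {
            html.append("  <li><b>").append(escape(e.title)).append("</b> - ")
                .append(escape(e.description)).append("</li>\n");
        }
        html.append("</ul>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Sample data invented for the demo; in practice these strings come from the feed.
        List<Entry> entries = new ArrayList<Entry>();
        entries.add(new Entry("New chapter posted", "Notes on <channel> & <item> elements"));
        System.out.print(render("Site updates", entries));
    }
}
```

Escaping everything is the cautious choice for feeds outside your control, since it keeps markup in a feed from being interpreted by the browser. Many feeds do put HTML in their descriptions, though, so if you trust a source you might prefer to pass its descriptions through a filter instead of escaping them.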
