
Crawling the Semantic Web, concluded


This article, the second of two parts, examines the problems raised by the glut of information available through the web, and how to tame it. It is excerpted from the book Wicked Cool Java, written by Brian D. Eubanks (No Starch Press, 2005; ISBN: 1593270615).

Author Info:
By: No Starch Press
March 02, 2006
TABLE OF CONTENTS:
  1. · Crawling the Semantic Web, concluded
  2. · Guess What? Publishing RSS Newsfeeds with Informa
  3. · What's Up? Aggregating RSS Newsfeeds
  4. · Heading to the Polls: Polling RSS Feeds with Informa
  5. · All the News Fit to Print: Filtering RSS Feeds with Informa


Crawling the Semantic Web, concluded - What's Up? Aggregating RSS Newsfeeds
(Page 3 of 5)

In the previous section, we used the Informa library to create RSS content, so that visitors with content aggregators can be automatically informed about updates to your site. Another great use of RSS within your site is displaying recent news related to your industry. You can get these newsfeeds from many sources, such as news sites, websites in your industry, and aggregator sites like Syndic8. Make sure to check whether the sites you are syndicating will allow you to incorporate items from their feeds into your site. Usually this is the case, but not always.

Let's start by reading items from a newsfeed and displaying them as text. Using Informa, reading an RSS feed is easy. You can populate the same ChannelBuilder object that we used in the previous section with data from an existing RSS feed. The FeedParser class has a parse method that returns a ChannelIF instance containing the channel data from the RSS feed. The RSS standards may be in a state of confusion, but the Informa API reads all of them and gives us a common object model for working with them.

import de.nava.informa.impl.basic.Channel;
import de.nava.informa.impl.basic.ChannelBuilder;
import de.nava.informa.impl.basic.Item;
import de.nava.informa.parsers.FeedParser;

// Wrapper class added so the snippet compiles; the name ReadFeed is arbitrary.
public class ReadFeed
{
  // Fetching and parsing can fail, so any exception is simply propagated here.
  public static void main(String[] args) throws Exception
  {
    ChannelBuilder builder = new ChannelBuilder();
    String url = "http://wickedcooljava.com/updates.rss";
    // parse returns a ChannelIF; the cast matches the basic builder used above
    Channel channel = (Channel) FeedParser.parse(builder, url);
    System.out.println("Description: " + channel.getDescription());
    System.out.println("Title: " + channel.getTitle());
    System.out.println("====================================");
    // using Java 5 syntax in this for loop
    for (Object x : channel.getItems())
    {
      Item anItem = (Item) x;
      System.out.print(anItem.getTitle() + " - ");
      System.out.println(anItem.getDescription());
    }
  }
}

This will print some basic information about the channel and its items. If you want to include these in a web page, it's now just a matter of wrapping HTML tags around the text. If you are including RSS files that are outside your control, you may want to filter data from the channels before displaying them. We'll discuss this in a later section.
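
As a rough illustration of wrapping HTML tags around the text, here is a minimal sketch that does not depend on Informa. It assumes you have already copied each item's title and description into plain strings, as the loop above does. The class FeedHtmlSketch, the Entry holder, and the methods escape and toHtml are hypothetical names invented for this example, not part of the Informa API, and the choice of a heading plus a bulleted list is only one possible layout.

```java
import java.util.ArrayList;
import java.util.List;

public class FeedHtmlSketch {

    // Hypothetical holder for the two strings printed per item; in real code
    // they would come from anItem.getTitle() and anItem.getDescription().
    static final class Entry {
        final String title;
        final String description;

        Entry(String title, String description) {
            this.title = title;
            this.description = description;
        }
    }

    // Replace the characters that are significant in HTML so that text from
    // a feed cannot inject markup into the page. Null becomes an empty string.
    static String escape(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Render the channel title as a heading and each entry as a list item.
    static String toHtml(String channelTitle, List<Entry> entries) {
        StringBuilder html = new StringBuilder();
        html.append("<h2>").append(escape(channelTitle)).append("</h2>\n");
        html.append("<ul>\n");
        for (Entry e : entries) {
            html.append("  <li><b>").append(escape(e.title)).append("</b> - ")
                .append(escape(e.description)).append("</li>\n");
        }
        html.append("</ul>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Sample data made up for the example.
        List<Entry> entries = new ArrayList<Entry>();
        entries.add(new Entry("New chapter posted", "Notes on <RSS> & Atom"));
        System.out.print(toHtml("Site updates", entries));
    }
}
```

This sketch treats every title and description as plain text and escapes it, which is a cautious default for feeds outside your control. Many feeds put HTML markup inside their descriptions, and with this approach that markup would be shown literally rather than rendered.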

