Crawling the Semantic Web
By: No Starch Press
    2006-02-23

    Table of Contents:
  • Crawling the Semantic Web
  • This Somethings That: A Short Introduction to N3 and Jena
  • Triple the Fun: Creating an RDF Vocabulary for Your Organization
  • Who’s a What? Using RDF Hierarchies in Jena
  • Getting Attached: Attaching Dublin Core to HTML Documents
  • What’s the Reason? Making Queries with Jena RDQL



    Crawling the Semantic Web - This Somethings That: A Short Introduction to N3 and Jena


    (Page 2 of 6 )

    The theory behind the RDF standard is actually quite simple. Everything has a Uniform Resource Identifier (URI), and by this I mean everything: not only documents but also generic concepts and relationships between them. Even though you are not a document (or are you?), there could be a URI assigned to represent you as an entity. This URI can then be used to make connections to other things. For the “you” URI, these connections might represent related organizations, addresses, and phone numbers. URIs do not have to return an actual document! This is what sometimes confuses developers when they see a URI referenced somewhere and find that there is nothing at the location. These addresses are often used as markers or unique identifiers to represent concepts. We make links between URIs to represent relationships between things. This functions much like a simple sentence in English:

    Programmers enjoy Java.

    To begin with, let’s use a shorthand notation, called N3, to encode this as an RDF graph. N3 is an easy way to learn RDF because the syntax is only slightly more complex than the sentence above! In essence, N3 is merely a set of triples, or “subject predicate object” relationships. Here is the N3 version of the sentence:

    @prefix wcj: <http://example.org/wcjava/uri/> .
    wcj:programmers wcj:enjoy wcj:java .

    We first define a prefix to make the N3 code less verbose. The prefix stands for the beginning of a URI wherever it appears in the document, so wcj:java becomes http://example.org/wcjava/uri/java. (In the prefix declaration the URI is placed within < and > markers; these have nothing to do with XML.) The three items together are called a triple, and the verb is usually called a predicate. RDF makes a link by stating that a subject URI is related by a predicate URI to an object URI. The predicate represents some relationship between the subject and object: it tells how things link together. This is very different from an anchor in HTML, because here the relationship type is clearly defined. Remember that URIs in RDF can stand for anything, such as concepts or documents, and that the object of a triple can in some cases be a String literal rather than a URI. In theoretical terms, we are creating a labeled directed graph of the relationship. A graph representation of the above might look like Figure 4-1.
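    The prefix mechanism described above is plain string substitution, which a few lines of standard Java can illustrate. The following is only a sketch and is not part of Jena: the class name PrefixExpander and its methods are hypothetical names invented for this illustration, and it handles only simple prefix:local names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not a Jena class): expands N3 prefixed names into full URIs. */
public class PrefixExpander {
    private final Map<String, String> prefixes = new HashMap<String, String>();

    /** Registers a prefix, much as "@prefix wcj: <...> ." does in N3. */
    public void addPrefix(String prefix, String baseUri) {
        prefixes.put(prefix, baseUri);
    }

    /** Expands "prefix:local" into baseUri + local; rejects unknown prefixes. */
    public String expand(String prefixedName) {
        int colon = prefixedName.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a prefixed name: " + prefixedName);
        }
        String base = prefixes.get(prefixedName.substring(0, colon));
        if (base == null) {
            throw new IllegalArgumentException("Unknown prefix in: " + prefixedName);
        }
        return base + prefixedName.substring(colon + 1);
    }

    public static void main(String[] args) {
        PrefixExpander expander = new PrefixExpander();
        expander.addPrefix("wcj", "http://example.org/wcjava/uri/");
        // Expand each part of the triple "wcj:programmers wcj:enjoy wcj:java ."
        for (String name : new String[] {"wcj:programmers", "wcj:enjoy", "wcj:java"}) {
            System.out.println(name + " -> " + expander.expand(name));
        }
    }
}
```

    Running main prints the subject, predicate, and object of the example triple as full URIs, which is the form an RDF tool works with once the prefixes are resolved.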


    Figure 4-1:
    RDF subject, predicate, and object

    As you might expect, there is a Java API for creating and managing RDF and N3 documents. Jena is an open-source API for working with RDF graphs. Here is one way to create the graph in Jena and serialize it to an N3 document:

    import com.hp.hpl.jena.rdf.model.*;
    import java.io.FileOutputStream;

    // The statements below belong inside a method that declares
    // "throws java.io.IOException" (for example, main).
    Model model = ModelFactory.createDefaultModel();
    // Subject, predicate, and object of the triple, each identified by a URI
    Resource programmers = model.createResource(
         "http://example.org/wcjava/uri/programmers");
    Property enjoy = model.createProperty(
         "http://example.org/wcjava/uri/enjoy");
    Resource java = model.createResource(
         "http://example.org/wcjava/uri/java");
    // Add the triple to the model, then serialize the model as N3
    model.add(programmers, enjoy, java);
    FileOutputStream outStream = new FileOutputStream("out.n3");
    model.write(outStream, "N3");
    outStream.close();

    Here, Jena is using the term property to refer to the predicate and resource to refer to something used as a subject or object. The model’s write method also has options to write out the document in other formats besides N3. With the Jena API, you can connect many entities together into very large semantic networks. Let’s make some additional relationships using the entities and relationships that we just created. We will produce the graph shown in Figure 4-2. 


    Figure 4-2:
    An RDF graph with multiple subjects

    Here is the additional code to produce the network in Figure 4-2:

    Property typeOf = model.createProperty(
       "http://example.org/wcjava/typeOf");
    Property use = model.createProperty(
       "http://example.org/wcjava/use");
    Property understand = model.createProperty(
       "http://example.org/wcjava/understand");
    Resource computers = model.createResource(
       "http://example.org/wcjava/computers");
    Resource progLang = model.createResource(
       "http://example.org/wcjava/progLang");
    model.add(java, typeOf, progLang);
    model.add(programmers, use, computers);
    model.add(computers, understand, progLang);
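    // Write to a named stream so that it can be closed afterward
    FileOutputStream outStream2 = new FileOutputStream("out2.n3");
    model.write(outStream2, "N3");
    outStream2.close();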

    The N3 output of this code is the following:

    <http://example.org/wcjava/uri/java>
      <http://example.org/wcjava/typeOf>
        <http://example.org/wcjava/progLang> .

    <http://example.org/wcjava/computers>
      <http://example.org/wcjava/understand>
        <http://example.org/wcjava/progLang> .

    <http://example.org/wcjava/uri/programmers>
      <http://example.org/wcjava/uri/enjoy>
        <http://example.org/wcjava/uri/java> ;
      <http://example.org/wcjava/use>
        <http://example.org/wcjava/computers> .

    The semicolon in the N3 document is a shortcut that indicates we are going to attach another property to the same subject (“programmers enjoy java, and programmers use computers”). The meanings of elements within a document are often defined in terms of a predefined set of resources and properties called a vocabulary. Your RDF data can be combined with other data in existing vocabularies to allow semantic searches and analysis of complex RDF graphs. In the next section, we illustrate how to extend existing RDF vocabularies to create your own.
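    To make the semicolon shortcut concrete, here is a small plain-Java sketch that groups triples by subject and joins the properties of each subject with semicolons. It is an illustration only and does not reproduce Jena's own N3 writer: the class SemicolonWriter and its write method are hypothetical names, and the exact whitespace of the output is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration (not Jena's N3 writer): groups triples by subject. */
public class SemicolonWriter {
    /** Each triple is a String[3] holding {subject, predicate, object} URIs. */
    public static String write(List<String[]> triples) {
        // Collect the predicate/object pairs of each subject, keeping first-seen order.
        Map<String, List<String[]>> bySubject = new LinkedHashMap<String, List<String[]>>();
        for (String[] t : triples) {
            List<String[]> pairs = bySubject.get(t[0]);
            if (pairs == null) {
                pairs = new ArrayList<String[]>();
                bySubject.put(t[0], pairs);
            }
            pairs.add(new String[] {t[1], t[2]});
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, List<String[]>> entry : bySubject.entrySet()) {
            out.append('<').append(entry.getKey()).append(">\n");
            List<String[]> pairs = entry.getValue();
            for (int i = 0; i < pairs.size(); i++) {
                String[] pair = pairs.get(i);
                out.append("  <").append(pair[0]).append("> <").append(pair[1]).append('>');
                // ";" means another property of the same subject follows; "." ends the subject.
                out.append(i < pairs.size() - 1 ? " ;\n" : " .\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String ns = "http://example.org/wcjava/";
        List<String[]> triples = new ArrayList<String[]>();
        triples.add(new String[] {ns + "uri/programmers", ns + "uri/enjoy", ns + "uri/java"});
        triples.add(new String[] {ns + "uri/programmers", ns + "use", ns + "computers"});
        System.out.print(write(triples));
    }
}
```

    Because both triples in main share the programmers subject, the sketch writes that subject once, ends the first property with a semicolon, and ends the second with a period.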

    This article is excerpted from the book Wicked Cool Java, written by Brian D. Eubanks (No Starch Press, 2005; ISBN: 1593270615).
