When I saw the latest of the Lord of the Rings movies a short while ago, I wondered how Tolkien had invented the artificial languages of Middle-earth. In my previous article, I described my wish to discover which real language had most influenced Tolkien's invented ones. As a software developer, I wanted to find the answer algorithmically. My idea was to use my own string similarity algorithm to compare each word in a list of Tolkien words with words from 14 real languages. For each Tolkien word, I would find and record the (lexically) most similar word and the language it came from. The resulting set of most-similar words and their languages would provide new insights into the influences on Tolkien.
Lord Of The Strings Part 2 - Discovering String Similarities (Page 3 of 7)
I wrote a Java program that, for each Tolkien word, finds the most similar word and stores the result back in the database. I will explain how the program works without going into the details of JDBC database access (if you are interested in such details, try this short introduction, or a book such as Beginning Java Databases). In fact, those details are hidden from the main program too: it uses a database access class that conceals its inner workings and exposes only the following interface:
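import java.sql.ResultSet;
import java.sql.SQLException;

public interface QueryRunner {
    // Open the connection to the database.
    public void openConnection() throws SQLException;

    // Run a query and return its result set.
    public ResultSet runQuery(String query) throws SQLException;

    // Run an update and return the number of affected rows.
    public int runUpdate(String sql) throws SQLException;

    // Close the connection to the database.
    public void closeConnection() throws SQLException;
}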
As you can see, there is a method for opening a database connection, one method each for executing a query and performing an update, and finally a method for closing the connection. It is a very simple interface, but sufficient for the task at hand.
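For readers curious about what might sit behind this interface, here is a minimal sketch of one possible implementation using plain JDBC. It is not the class used in the article: the name JdbcQueryRunner, the constructor taking a JDBC URL, and the helper requireConnection() are all assumptions made for illustration. The interface is repeated (without the public modifier) only so that the sketch compiles on its own in a single file.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// The interface from the article, repeated so the sketch is self-contained.
interface QueryRunner {
    void openConnection() throws SQLException;
    ResultSet runQuery(String query) throws SQLException;
    int runUpdate(String sql) throws SQLException;
    void closeConnection() throws SQLException;
}

// Hypothetical implementation: class name and constructor are assumptions.
class JdbcQueryRunner implements QueryRunner {
    private final String jdbcUrl;   // e.g. a URL for whichever driver you use
    private Connection connection;  // null until openConnection() succeeds

    JdbcQueryRunner(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    public void openConnection() throws SQLException {
        connection = DriverManager.getConnection(jdbcUrl);
    }

    public ResultSet runQuery(String query) throws SQLException {
        Statement statement = requireConnection().createStatement();
        // Release the statement automatically once its result set is closed.
        statement.closeOnCompletion();
        return statement.executeQuery(query);
    }

    public int runUpdate(String sql) throws SQLException {
        try (Statement statement = requireConnection().createStatement()) {
            return statement.executeUpdate(sql);
        }
    }

    public void closeConnection() throws SQLException {
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }

    // Hypothetical helper: fail clearly if the caller forgot to connect.
    private Connection requireConnection() throws SQLException {
        if (connection == null) {
            throw new SQLException("openConnection() has not been called");
        }
        return connection;
    }
}
```

The point of the design is that the main program only ever sees QueryRunner, so a different database or driver can be swapped in without touching the word-matching code.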
The program also makes use of a small class that represents a word, its identifier, and the language from which it came:
class Word {
    private String word;  // the word itself
    private int id;       // its identifier
    private String lang;  // the language it came from

    public Word(String wd, int wdId, String language) {
        word = wd;
        id = wdId;
        lang = language;
    }

    // Renders the word as Word[word,id,lang].
    public String toString() {
        StringBuilder buf = new StringBuilder("Word[");
        buf.append(word);
        buf.append(",");
        buf.append(id);
        buf.append(",");
        buf.append(lang);
        buf.append("]");
        return buf.toString();
    }
}
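To make the overall idea concrete, here is a rough sketch of how a list of Word objects might be scanned for the one most similar to a given Tolkien word. Several things in it are assumptions rather than the article's code: the accessors getWord() and getLang() are added to Word for illustration, the class BestMatchSketch and its methods are hypothetical, and the similarity() method is a simple stand-in (a Dice coefficient over sets of adjacent letter pairs), not the string similarity algorithm from the previous article. The sample words in the usage note are made up for illustration and are not results.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Word as in the article, plus two hypothetical accessors.
class Word {
    private final String word;
    private final int id;
    private final String lang;

    Word(String wd, int wdId, String language) {
        word = wd;
        id = wdId;
        lang = language;
    }

    String getWord() { return word; }  // assumed accessor
    String getLang() { return lang; }  // assumed accessor

    public String toString() {
        return "Word[" + word + "," + id + "," + lang + "]";
    }
}

class BestMatchSketch {
    // Collect the set of adjacent letter pairs in a word, ignoring case.
    static Set<String> letterPairs(String s) {
        Set<String> pairs = new HashSet<>();
        String lower = s.toLowerCase(Locale.ROOT);
        for (int i = 0; i + 1 < lower.length(); i++) {
            pairs.add(lower.substring(i, i + 2));
        }
        return pairs;
    }

    // Stand-in similarity: 2 * shared pairs / total pairs, between 0 and 1.
    static double similarity(String a, String b) {
        Set<String> pairsA = letterPairs(a);
        Set<String> pairsB = letterPairs(b);
        if (pairsA.isEmpty() || pairsB.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(pairsA);
        shared.retainAll(pairsB);
        return 2.0 * shared.size() / (pairsA.size() + pairsB.size());
    }

    // Return the candidate with the highest score, or null for an empty list.
    static Word mostSimilar(String target, List<Word> candidates) {
        Word best = null;
        double bestScore = -1.0;
        for (Word candidate : candidates) {
            double score = similarity(target, candidate.getWord());
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }
}
```

For example, calling BestMatchSketch.mostSimilar("mordor", candidates) with a list containing new Word("mordant", 1, "English") and new Word("talo", 2, "Finnish") returns the first entry, because "mordor" and "mordant" share the letter pairs "mo", "or" and "rd" while "talo" shares none. In the real program the candidates would come from the database through QueryRunner rather than from a hand-built list.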