When I saw the latest in the Lord of the Rings trilogy of movies a short while ago, I wondered how Tolkien had invented the artificial languages of Middle-earth. In my previous article, I told of my desire to discover which real language had been the biggest influence on Tolkien's invented ones. As a software developer, I wanted to discover this information algorithmically. My idea was to use my own string similarity algorithm to compare each word from a list of Tolkien words to words from 14 real languages. For each Tolkien word, I would find and record the language containing the word that is (lexically) most similar. The set of most-similar words and the languages from which they came would provide new insights into the influences on Tolkien.
Lord Of The Strings Part 2 - Differences on the Table
Computing the expected number of hits is the statistical equivalent of a control experiment in the physical sciences - in other words, what would be the outcome if there were no interesting behavior to study? You might think such an exercise to be of little use, as we're pretty sure there is something interesting going on. The point is that we can now compare the actual results that we obtained to the expected results, and look at the differences. The following table orders the languages according to the size of the difference between actual and expected number of hits.
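The ordering step described here is simple enough to sketch in code. The following Python is only an illustration: the function name `rank_by_difference` is hypothetical, the expected counts are taken as ready-made inputs, and the numbers are placeholders rather than the results of this investigation.

```python
def rank_by_difference(actual, expected):
    """Return (language, actual, expected, difference) rows,
    ordered from the largest difference to the smallest."""
    rows = [
        (lang, actual[lang], expected[lang], actual[lang] - expected[lang])
        for lang in actual
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Placeholder counts for illustration only; NOT the article's data.
actual_hits = {"LangA": 30, "LangB": 12, "LangC": 8}
expected_hits = {"LangA": 20.0, "LangB": 12.5, "LangC": 17.5}

for lang, act, exp, diff in rank_by_difference(actual_hits, expected_hits):
    # One row per language: name, actual hits, expected hits, signed difference.
    print(f"{lang:<8}{act:>6}{exp:>8.1f}{diff:>+8.1f}")
```

A positive difference means a language scored more hits than chance alone would suggest, and a negative one means it scored fewer.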
Now we're getting somewhere. The languages with differences greater than zero may have had an influence on Tolkien, and the size of the difference indicates the level of that influence. So we're beginning to see that Tolkien's mother tongue of English seems to have had the most profound influence on him. My reaction to this was, at first, one of surprise, then of reassurance. I was surprised because of the apparent dissimilarity of Tolkien's invented words to English, and the fact that the Tolkien words matched only three English words exactly. But I was reassured because Tolkien was, after all, English, and you would expect him to be heavily influenced by his native language.
Note also the particularly strong result for Hungarian, which received nearly three times as many hits as expected. Finnish performed almost exactly as expected, indicating no appreciable influence on Tolkien, whilst French and German performed well under expectations, perhaps indicating that Tolkien was deliberately avoiding the influences of these languages.
Conclusions
When I started this investigation, I had no idea what the result would be. I just clung firmly to the belief that my string similarity metric, together with a simple algorithm to iterate over the set of possible word pair comparisons, would provide an interesting result. In fact, the results are very satisfying. I found that English had a profound effect on Tolkien's invented languages, with perhaps further influences from Hungarian and Spanish. This is satisfying because it is entirely reasonable (at least the part about English!), though not exactly what I expected after reading about the (apparently unfounded) claims for the influences of Finnish. It is also satisfying because it increases my confidence in the string similarity method. And as developers, we like to have confidence in our methods.
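For readers who want to experiment with the same idea, the iteration over word pairs can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code used for the investigation: the metric below is a Dice coefficient over adjacent letter pairs, used as a stand-in because the similarity metric itself is not reproduced on this page, and the function names and tiny word lists are hypothetical.

```python
from collections import Counter


def letter_pairs(word):
    """Return the adjacent character pairs of a word, lowercased."""
    word = word.lower()
    return [word[i:i + 2] for i in range(len(word) - 1)]


def similarity(a, b):
    """Stand-in metric (an assumption): Dice coefficient on letter pairs."""
    pairs_a, pairs_b = letter_pairs(a), letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 0.0
    remaining = list(pairs_b)
    shared = 0
    for pair in pairs_a:
        if pair in remaining:
            shared += 1
            remaining.remove(pair)  # each pair in b may be matched only once
    return 2.0 * shared / total


def best_match(word, word_lists):
    """Return (language, candidate, score) for the most similar candidate.
    On a tie, the first candidate encountered wins."""
    best = (None, None, -1.0)
    for language, candidates in word_lists.items():
        for candidate in candidates:
            score = similarity(word, candidate)
            if score > best[2]:
                best = (language, candidate, score)
    return best


def count_hits(invented_words, word_lists):
    """Count, per language, how often it supplies the best match."""
    hits = Counter()
    for word in invented_words:
        language, _, _ = best_match(word, word_lists)
        if language is not None:
            hits[language] += 1
    return hits


# Toy word lists for illustration only; not the lists used in the article.
toy_lists = {"English": ["silver", "river"], "Spanish": ["silla", "rio"]}
print(best_match("silvren", toy_lists))
print(count_hits(["silvren"], toy_lists))
```

The per-language counts returned by `count_hits` correspond to the actual hits that are compared against the expected hits above.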