Tuesday, September 20, 2011

Ngrams + lots of books

This is so cool. An n-gram is a sequence of n items; here, that means n consecutive words from a text. Google Labs counted how often n-grams appear across their colossal book database (15 million!), and has given us access to the results. It allows for a fascinating glimpse at our cultural history via the trends found in the data. If you want to dig right in, you can go to the Google Labs Ngram Viewer. Or you can get a better sense of how it works here, or with this video from TED Talks:

Here are a few samples I ran through the gears: liberal, conservative; religion, science; evolution, creation; compassion, responsibility; jesus, santa -- but oops! be careful... this thing is case sensitive... try that one again: Jesus, Santa; Beethoven, Beatles; jazz, blues; football, baseball, basketball, soccer, cricket, hockey, rugby
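
For the curious, here is a minimal Python sketch of the idea. It is not how Google actually does it, just a toy to show what an n-gram is and why case matters when you count them. The `ngrams` function and the `sample` sentence are made up for illustration, and splitting on whitespace is a simplifying assumption.

```python
from collections import Counter

def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples."""
    words = text.split()  # assumption: words are separated by whitespace
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Made-up sample text, purely for illustration.
sample = "Jesus and Santa and jesus and santa"

# Count 1-grams exactly as written: "Jesus" and "jesus" are different keys.
exact = Counter(ngrams(sample, 1))

# Lowercase the text first to merge the two spellings into one count.
folded = Counter(ngrams(sample.lower(), 1))

print(exact[("Jesus",)], exact[("jesus",)], folded[("jesus",)])
print(ngrams(sample, 2)[:2])  # the first two 2-grams (pairs of neighboring words)
```

In the exact count the capitalized and lowercase spellings stay separate, which is the same trap as typing jesus instead of Jesus into the viewer.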

