For me, one sign of a really good book is that I learn things I wasn’t expecting to learn. I had that experience while reading almost every chapter of Uncharted: Big Data as a Lens on Human Culture. The book is written by the creators of Google’s Ngram Viewer, a tool that shows the frequency of any word or phrase (single words are 1-grams, two-word phrases are 2-grams…) in the massive and continually growing corpus of books in the Google Books database. The most informative feature of Ngram Viewer is that you can compare the frequencies of different phrases and see how their use changes over time (here’s a holiday phrase comparison that I made).
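If the n-gram idea feels abstract, here is a minimal Python sketch of it. This is my own toy illustration, not the Ngram Viewer's actual code: the function names `ngrams` and `relative_frequency` are made up, splitting on whitespace is a crude stand-in for real tokenization, and dividing by the total number of n-grams in the text is just one plausible way to define "frequency."

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, lowercased."""
    words = text.lower().split()  # assumption: whitespace tokenization
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequency(phrase, text):
    """Share of the text's n-grams that match phrase (n = words in phrase)."""
    n = len(phrase.split())
    grams = ngrams(text, n)
    if not grams:
        return 0.0  # text is shorter than the phrase
    return Counter(grams)[phrase.lower()] / len(grams)


sample = "Merry Christmas to all"
print(ngrams(sample, 1))  # the 1-grams (single words)
print(ngrams(sample, 2))  # the 2-grams (two-word phrases)
print(relative_frequency("Merry Christmas", sample))
```

Run the same calculation on the books published in each year and you get, roughly, the kind of curve the tool plots.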
The book includes many ngram comparisons that are much more informative than mine. It tells the story of the Ngram Viewer’s birth and goes into more depth on a variety of uses. Maybe the most surprising use is that ngrams can reflect censorship efforts. By looking at the slopes of the changes in frequency for different people’s names during the Nazi regime, it becomes clear that some names were being censored (those ngrams have negative slopes for that time period) and others were rising in prominence (those have positive slopes). When compared with historical records, the ngram-based conclusions are strikingly accurate.
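To make the slope idea concrete, here is a small Python sketch. It is my own simplification and not necessarily the authors' method: it fits an ordinary least-squares line to a name's yearly frequency and reads off the sign of the slope. The function names and the year range are assumptions, and the frequency values are invented purely for illustration, not real Ngram Viewer data.

```python
def slope(years, freqs):
    """Least-squares slope of frequency against year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(freqs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, freqs))
    var = sum((x - mean_x) ** 2 for x in years)
    return cov / var


def trend(years, freqs):
    """Label a series by the sign of its fitted slope."""
    s = slope(years, freqs)
    if s < 0:
        return "falling"
    if s > 0:
        return "rising"
    return "flat"


# Hypothetical year range and made-up frequencies (arbitrary units).
years = list(range(1933, 1946))
suppressed_name = [10.0 - 0.5 * i for i in range(len(years))]
promoted_name = [2.0 + 0.3 * i for i in range(len(years))]

print(trend(years, suppressed_name))  # negative slope
print(trend(years, promoted_name))    # positive slope
```

A negative slope on its own doesn't prove censorship, of course, which is why the comparison with historical records matters.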
The book shows only a tiny slice of what the Ngram Viewer can be used to learn. This kind of work is the epitome of cognitive science, piecing together wisdom from many disciplines. Ngram Viewer is a great tool, whether you’re at home on the couch wondering when the phrase “Merry Christmas” became popular or doing paid research, and this book was a cool way to learn more about it.