Researchers have created a new approach to scholarship that uses approximately 4 percent of all books ever published as a digital "fossil record" of human culture, according to a paper in the latest issue of the journal Science. By tracking the frequency with which words appear in books over time, scholars can now precisely quantify a wide variety of cultural and historical trends, the paper says. The article describes the four-year effort, led by Harvard University's Jean-Baptiste Michel and Erez Lieberman Aiden.
The team comprises researchers from Harvard, Google, Encyclopaedia Britannica and the American Heritage Dictionary. It has already used its approach, called 'culturomics' by analogy with genomics, to gain insight into topics as diverse as humanity's collective memory, the adoption of technology, the dynamics of fame, and the effects of censorship and propaganda.
Google will release a new online tool to accompany the paper: a simple interface that lets users type in a word or phrase and immediately see how its usage frequency has changed over the past few centuries.
The underlying dataset, which is available for download, is based on the full text of about 5.2 million books containing more than 500 billion words in total. About 72 percent of its text is in English, with smaller amounts in French, Spanish, German, Chinese, Russian and Hebrew.
The dataset is claimed to be the largest data release in the history of the humanities. The authors note that it amounts to a sequence of letters 1,000 times longer than the human genome; written in a straight line, it would reach to the moon and back 10 times over.
The paper describes the development of this new approach and surveys a vast range of applications, focusing on the past two centuries.
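The core measurement behind this approach, how often a word appears relative to everything published in a given year, can be illustrated with a minimal sketch. It is not the team's actual pipeline: the function name `yearly_frequency`, the whitespace tokenization, and the three-book toy corpus are all assumptions made purely for illustration.

```python
from collections import Counter


def yearly_frequency(books, word):
    """Return {year: share of all words published that year that equal `word`}.

    `books` is an iterable of (year, text) pairs. Tokenization here is a
    naive lowercase whitespace split, an assumption for this sketch only.
    """
    hits = Counter()    # occurrences of the target word per year
    totals = Counter()  # total word count per year
    target = word.lower()
    for year, text in books:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(target)
    # Normalize by the yearly total so years with more books are comparable.
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Hypothetical toy corpus, not real data.
toy_books = [
    (1900, "the telegraph and the telephone"),
    (1900, "the horse and the carriage"),
    (1950, "the telephone rang and the radio played"),
]

freq = yearly_frequency(toy_books, "telephone")
for year, share in freq.items():
    print(year, round(share, 4))
```

Dividing by the total number of words published each year, rather than reporting raw counts, is what makes different periods comparable, since far more text was printed in some years than in others.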
The work was funded by Google, a Foundational Questions in Evolutionary Biology Prize Fellowship, Harvard Medical School, the Harvard Society of Fellows, a Fannie and John Hertz Foundation Graduate Fellowship, a National Defense Science and Engineering Graduate Fellowship, a National Science Foundation Graduate Fellowship, the National Space Biomedical Research Institute, the National Human Genome Research Institute, the Templeton Foundation, the National Institutes of Health, and the Bill and Melinda Gates Foundation.