Internet resources: Google Books collection as an object of literary analysis

28 July 2010

Peter Leonard is a doctoral student in Scandinavian studies at the University of Washington. He and a partner, University of California, Los Angeles (UCLA) professor Tim Tangherlini, have received $45,000 from Google to create tools for large-scale literary analysis through Google Books. The grant is part of nearly $1 million Google has committed to support digital humanities research over the next two years.

Their subject will be 160,000 Swedish, Danish and Norwegian texts that are part of the 12-million-volume Google Books collection.

Leonard, a Berkeley, Calif., native, took to Scandinavian languages when he was an undergraduate at the University of Chicago. Up to now, he has done his research on recent Swedish fiction the old-fashioned way: studying a few books intensively. Now he and his partner propose to move from microanalysis to macroanalysis, sifting through thousands of books for clues to human culture and development and to how people of a certain time and place thought and lived.

"We might ask: What kinds of adjectives were used near female characters in 19th-century novels?" says Leonard. "What words were used to describe nature? You might be able to find interesting things about how people talked about the city, or the country. You can do this only if you have computers that can count the words and do mathematical calculations." Once the relevant books are identified, they can be read intensively for more clues.

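The kind of counting Leonard describes can be illustrated with a minimal sketch. It is not the researchers' tool; the function name `collocates`, the window size and the sample sentence are all assumptions made for illustration. It tallies every word that appears within a few words of a target term such as "woman". Picking out adjectives specifically would need a part-of-speech tagger, which is left out here.

```python
import re
from collections import Counter


def collocates(text, targets, window=3):
    """Count words that occur within `window` tokens of any target word.

    Hypothetical helper for illustration only: it counts all nearby
    words, not just adjectives.
    """
    # Lowercase and split into word tokens; \w+ also matches letters
    # such as å, ä and ö in Python 3.
    tokens = re.findall(r"\w+", text.lower())
    targets = {t.lower() for t in targets}
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok not in targets:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Skip the target itself and any other target words.
            if j != i and tokens[j] not in targets:
                counts[tokens[j]] += 1
    return counts


# Invented sample text, not drawn from the Google Books corpus.
sample = "The young woman walked home. The old man met the clever woman."
counts = collocates(sample, {"woman"}, window=2)
print(counts.most_common(5))
```

Run over thousands of digitized novels instead of one sentence, tallies like these could point a researcher toward the books worth reading closely.
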
A test project for Leonard and Tangherlini: analyzing books to show how folklore spread through 19th-century Scandinavian literature, a subject in which Tangherlini already has expertise. While Scandinavian-language books are a small fraction of the Google Books corpus, the two hope to develop strategies that will be "germane and applicable to someone who's studying Italian literature, or the American literature of the South."