How Google Is Solving Its Book Problem 58
Pickens writes "Alexis Madrigal writes in the Atlantic that Google's famous PageRank algorithm can't be deployed to search through the 15 million books that Google has already scanned because books don't link to each other in the way that webpages do. Instead Google's new book search algorithm called 'Rich Results' looks at word frequency, how closely your query matches the title of a book, web search frequency, recent book sales, the number of libraries that hold the title, how often an older book has been reprinted, and 100 other signals. 'There is less data about books than web pages, but there is more structure to it, and there's less spam to contend with,' writes Madrigal. Yet the focus on optimizing an experience from vast amounts of data remains. 'You want it to have the standard Google quality as much as possible,' says Matthew Gray, lead software engineer for Google Books. '[You want it to be] a merger of relevance and utility based on all these things.'"