That is the question posed by Paul Duguid, a professor at UC Berkeley, the University of London and Santa Clara University, about the Google Books Project. His article, “Inheritance and loss? A brief survey of Google Books,” was just published in First Monday, a peer-reviewed online journal about the Internet.
Duguid’s point is that the Google Books project will far outstrip most other projects to digitize cultural artifacts, making them “appear inept or inadequate.” But the authority and quality of the Google project, Duguid argues, are based on a kind of inheritance from the reputation of the libraries involved. So Duguid sets out to see if Google really is the qualitative heir of Harvard and Stanford.
His results are disheartening. His search for a deliberately unconventional book, Sterne’s “Tristram Shandy,” returns results likely to confuse and discourage a casual reader. The first result on Google’s results list, a copy from Harvard, is so badly scanned that it is virtually illegible, with words cut off by the gutter on nearly every line. Elsewhere the text fades to indecipherable scratchings. And some of Sterne’s eccentricities are missing; the black page of mourning for the dead Parson Yorick simply is not included in the Google scan. When Duguid tries the second result from his search, things get worse. The first page of the scan is blank, and the second page puts the reader at the end of chapter one and the beginning of chapter two, but of the second volume. Nothing short of comparison with a printed text informs readers that they have been plunged into the middle of the book.
Duguid’s judgments on Google Books are harsh: the project ignores essential metadata like volume numbers, the quality of the scans is often inadequate, and sometimes editions that are best consigned to oblivion are given undeserved prominence for no discernible reason (that is his conclusion regarding the second text he found, from Stanford). Rather than inheriting quality from Harvard and Stanford, he concludes, “Google threatens not only its own reputation for quality and technological sophistication, but also those of the institutions that have allied themselves to the project.”
It is true that the real value of the Google Books Project is not so much to find reading matter for people as to direct them to the books most likely to be of help or interest to them. Few people, one presumes, will try to read “Tristram Shandy” in the Google Books format. But the failures of visual quality and metadata control threaten even the more modest view of Google Books as a giant index. Without a higher degree of quality than Duguid discovered, it is hard to argue that Google is superior in any way to a comprehensive online catalog from a major library.