Google books, orphan works and academic values - Scholarly Communications @ Duke

I have long been impressed by the well-reasoned and articulate way in which Pamela Samuelson expresses her opposition to the Google Books Settlement. In the current issue of “Against the Grain,” a newsletter-style publication for librarians, publishers and vendors, Samuelson’s is, to my mind, the most persuasive of six short essays discussing the settlement. Unfortunately, there is no link available, since the current issue of “Against the Grain” is only available online for subscribers (which strikes me as pretty ironic). Nevertheless, lots of Samuelson’s writing on the settlement can be found linked to this page.

I want to make two points about her ATG essay.

First, I was struck by her remarks about the relationship between the GBS and potential solutions to the orphan works problem. Google, and at least one of the other ATG essayists, tout the settlement as an incentive for Congress to solve the persistent problem of making orphan works more usable. But as Samuelson points out, the Google settlement approach really hearkens back to a solution for orphan works — the creation of an escrow account to compensate owners who appear after the use in question has already been undertaken — that was considered and rejected some while ago.

The Library of Congress considered such an escrow model when it recommended an orphan works solution to Congress and decided against it. One obvious reason is that it establishes a financial barrier to using orphan works; such a barrier might reduce the problem of a “chilling effect” on such usage based on fear of litigation, but it would establish a new hurdle. Congressional proposals have instead followed the Library of Congress’ recommendation and focused, much more sensibly, in my opinion, on liability rules.

That Google thinks that an escrow model might pave the way for a full-blown orphan works solution suggests how completely in thrall to commercial publishers they became once they abandoned the idea of defending fair use in court. They now have no place to go but toward subservience.

The second point I want to make based on Samuelson’s piece builds on her complaint that her objections to the settlement should not be addressed piecemeal, as Google has tried to do, but should be seen as reflecting a fundamental conflict between the GBS and the “cultural ecology of knowledge in academic communities.” I think she is right, and I particularly like her characterization of the system we tend to call scholarly communications. That system embodies a unique set of practices, values and incentives that are quite different from those ensconced in our traditional publication system. Indeed, the academic ecosystem is ill-served by an emphasis on the commercial value of knowledge and the subsequent drive to enforce an artificial scarcity on that knowledge. As Samuelson says, the academic ecosystem will be impaired by the Google settlement much more than it will be aided by the commercialization of out-of-print works.

I want to carry this point about an ecosystem a little further, and tie it to a theme I have written about before. Once we recognize a unique academic culture and business model for knowledge, that recognition should be carried over into the analysis of fair use in the academic environment. The second fair use factor, about the nature of the original work, is the obvious place to recognize why copyright works so poorly for academics, even if we concede that it still functions acceptably for Britney Spears. The Google settlement is clearly too sweeping and inclusive for this point to have a significant impact on its construction, although Samuelson is persuasive in arguing that it shows that the classes in this class action lawsuit are not truly representative. But in situations where mostly academic works are at issue, as in the lawsuit against Georgia State University, courts should use the second fair use factor to explore the prevailing knowledge ecosystem in a comprehensive way. Such an examination of the scholarly communications system would result, inevitably, in a broader scope for fair use of academic works than might be available for works that are born and live out their useful lives (and then some!) in a commercial environment.

3 thoughts on “Google books, orphan works and academic values”

While there is much to learn from Pam Samuelson’s brief article, I am also troubled by some of her argument. If her facts are wrong, can we accept her conclusions? Here are some examples:

1. She writes: “Most of the books that will be regulated by the settlement agreement are out-of-print books from the collections of major research libraries such as the University of California, and most of these books were written by scholars for scholarly audiences.” I am not sure about her first “most” – I thought the settlement only applied to books that were in copyright and not commercially available.

It is the second part, though, that really troubles me. Any academic librarian involved in collection development knows that while most items are acquired to serve a scholarly audience, not all of them are written by scholars. And I know of no analysis that suggests that most of the books were “written by scholars.” The CMU random trial study, for example, found that “most of the books in the sample were published by commercial publishers,” and that “University presses and scholarly associations … published little of the content in the sample.” While academic authors often publish with trade presses, I know of no evidence that suggests that they constitute the bulk of the authors. And while Brian Lavoie’s analysis of the post-1923 titles in WorldCat found in Google libraries led him to “infer that it is likely intended for a scholarly or research-oriented audience,” nothing in his analysis of the 820,000 unique authors found in post-1923 US works could lead to the conclusion that most of them were academics. Furthermore, neither study considers the nature of authorship of foreign works, which would form a large percentage of the Google database.

There is no question that academic authors form a significant portion of the collections that have been digitized and that their interests may not be represented by the Authors Guild. One should not, however, assert that most of the books in the database were written by scholars for scholarly audiences.

2. “The Financial Times has estimated the number of books likely to be orphans as between 2.8 to 5 million. These books will form a core part of the institutional subscription database to which my university and others are expecting to subscribe.”

The FT numbers are based in part on some preliminary work that I did. What has become clearer over time, however, is that the number of orphan works that will be included in the database is quite small, and shrinking with each change to the settlement. Michael Cairns, for example, subsequently estimated that there are 580,388 U.S. orphan works. The actual number covered by the settlement will be much lower because the settlement only applies to works that were registered with the Copyright Office (not published, which is what Cairns counted), and that do not have later editions or other complications that make them ineligible.

The initial settlement agreement held out the potential of making a large number of orphan works available, including those works whose publication status was unclear. One of the great “accomplishments” of the critics has been to sharply restrict the nature of the final Google database by excluding these works, making it much less useful to scholars.

3. Samuelson and you bemoan the replacement of the proposed orphan works solution with an escrow model. The Copyright Office, however, as Jule Sigall repeatedly asserts, never intended the orphan works legislation to be a solution to the problem of mass digitization of material. The “reasonable investigation” approach might work if you are trying to put up a dozen books for a particular project, but not 15 million. So while I don’t particularly like the escrow approach, either, it does seem like a cost-effective approach to dealing with the problem that Google faced (and which the Copyright Office had avoided). I also think that the Settlement could in the long run work to everyone’s advantage if orphan works legislation does pass. After all, if an author is unwilling to come forward to claim her royalties from Google and the efforts of the Book Rights Registry fail to locate her, that is a pretty good indication that the work is an orphan work. Google would be able to withdraw it from the settlement (and stop paying escrow fees) and libraries, after conducting a “reasonable search” by querying Google’s database, would be free to use it.

4. Like you, I do like Samuelson’s argument that copyright for academic works is different than for Disney and other commercial entities. I was recently re-reading Abraham L. Kaminstein’s 1961 report on copyright revision, and was struck by the section on copyright notice. The report notes that “The notice requirement serves to place most of the great mass of published material in the public domain, while giving authors the opportunity to secure copyright when they want it. Most published material bears no notice, and is therefore in the public domain, because the author is not interested in securing copyright. This uncopyrighted material includes … scholarly, scientific, and other informational matter which the author is willing to make freely available for reproduction and circulation by anyone.” Kaminstein concludes: “We believe the public interest is served by keeping free of copyright restrictions the great bulk of published material in which the authors do not wish to secure copyright.” One of the great questions I have is how we moved from an environment in which it was assumed that scholarly and scientific work did not need copyright protection to one where it is assumed that it does. I am not sure, however, that the Google Books Settlement is the right place to tackle the question of the scope and purpose of existing copyright law.

So as an opening on the question of the nature of scholarly publishing and authorship, Samuelson’s work is, as you suggest, thought-provoking. But as a discussion of the settlement, I much prefer Ivy Anderson’s first-rate piece in the same issue, which can also be found at the CDL blog.
