An era ended with an email that I got a few weeks ago from the Library of Congress.
Maybe I’m being a bit dramatic, but the LoC informed us that its venerable American Memory site would no longer include records and links to Historic American Sheet Music and Emergence of Advertising in America, the two digital collections that Duke University Libraries built with grants from the LoC and Ameritech ca. 1996-1998. Since then, the email explained,
[T]he Internet has changed significantly. Search engines have dramatically improved; users have come to expect that the most relevant content to their search query will be found regardless of its location on the web. Users no longer rely on browsing through aggregated directories of content but instead find discrete pages via searching and following related links. In the environment dominated by search engines, duplication can detract from an item’s findability, rather than enhance it.
While we’ll miss the juicy web stats that we got from American Memory referrals, it’s hard to argue with the message’s logic, its summary of user expectations, and the desire of the LoC to simplify its architecture and remove dependencies on the bit-rotting sites of the original Ameritech grant recipients. Still, the end of a long-term relationship tends to make one reflective. It got me thinking about the history of digital collections in libraries, and how we in the field have passed through two distinct ages, roughly a decade each, and now enter a third which, to some extent, is ours to make.
The early American Memory project belongs to the Age of Discovery for library digital collections. By the early 1990s, many libraries were engaged in some form of electronic capture of their unique and distinctive collections, with plans to make them available via emerging digital technologies. It was already new ground to explore, and would soon open up more than anyone imagined.
American Memory itself began at the Library of Congress as a CD-ROM project. As the Washington Post reported in 1991*, the pilot phase of the project would distribute the CDs to historical societies, public libraries, and junior high school libraries “to help focus the prototype.” The Post article spent less space describing the content to be included on the CD-ROMs than it did the technology itself, and the advantages of digitized archives over physical ones for the researcher. The ubiquity of personal computers was still a novel enough condition that it seemed newsworthy for a library to use them to extend its reach.
Then the World Wide Web happened, and libraries made the very natural leap from distribution via CD-ROMs and other physical media to using the Internet. The Library of Congress shifted immediately to the new strategy, partnering with various donors, including Ameritech (see Wikipedia for an account of that corporation’s own fluid history), to fund digitization efforts around the country. It now described American Memory as a “five-year, $60-million plan to establish a National Digital Library” that “would link historical societies and libraries nationwide and worldwide so that the public would have access to vital historical documents, artifacts and images.”**
Duke received an Ameritech grant in the first year of the program, 1996-7, for Historic American Sheet Music, then a second the next year for Emergence of Advertising in America. These projects followed on the debut of the Duke Papyrus Archive, still hosted here in all its 1995 static HTML glory. A 1998 New York Times article quotes Steve Hensen, then an archivist in our Rare Book, Manuscript, and Special Collections Library (now Rubenstein), capturing the essence of the Age of Discovery for library digital collections.
While Duke was photographing and cataloguing its [papyrus] collection, [Hensen] said, “the World Wide Web burst upon the scene, and we had this flash: Why make photos? Why not scan them and see if we can put them up on the Web and make them accessible worldwide?”***
The WWW “burst upon the scene” while libraries and archives were busy thinking of ways to use personal computers to provide access to electronic versions of their collections. It brought about a decade of experimenting with pipelines from the stacks to flatbed scanners and on to the Web.
Beginning in the mid-2000s, libraries and cultural heritage organizations went about summarizing their lessons learned, and shifting their attention to digitization at scale. We’ll call this the Industrial Revolution for digital collections; its buzzphrase has been “mass digitization.” We can date its ascendance to 2005, with the publication of the influential paper, “More Product, Less Process.”**** While the main focus for Greene and Meissner was the processing backlogs that exist in archives, digitization played a prominent role in their discussion. The idea that archives should process faster and with less attention to detail informed OCLC’s “Shifting Gears,”***** a series of provocative statements on digitization that urged “programs not projects,” access over preservation, and quantity over quality.
As it happens, 2005 was the same year our Digital Production Center came into being. I think if you browse through the archives of this blog, you’ll find numerous posts that show a digital collections program maturing at the tail end of this Industrial Revolution. While we might bend your ear about some areas where we could use more staff support, still we find ourselves, almost a decade later, with a vastly expanded capacity to digitize and publish the Duke University Libraries’ unique, rare, and distinctive collections.
So what’s next?
I think our primary audience – the researchers who need and use our collections – presents us with two trends that should guide our approach. First, their use of, and demand for, digitized primary sources in their work is growing. Second, they also use and demand an expanding range of methods and tools for working with these sources.
I plan to go into more detail on these trends in a later post. But I think they are clear, and present libraries and other organizations with a critical challenge. The activities in which researchers engage will largely fall within the scope of what we either provide or enable. The choices that we make about what we digitize, and how we provide access to it, will circumscribe the boundaries of research in the humanities and other fields for decades to come. That’s a big responsibility.
We need to work with our constituents to understand better how our digitization efforts impact research and learning. We need to plan, and to be deliberate and transparent in our choices. The previous two ages of digital collections happened mostly in the conference rooms and internal communications of the library, but this next one should be different. The present situation calls for an Age of Engagement.
* Mendel-Black, D. (1991, November 11). Putting a Nation’s Story Within Fingertip Reach; Library of Congress Taps CD-ROM’s. Washington Post.
** Snider, M. (1997, April 10). Research archives in cyberspace; Library of Congress grants help libraries put treasures on line. USA Today.
*** Thomas, J. (1998, November 29). Digitized Artifacts Are Making Knowledge Available to All, on Line. New York Times.
**** Greene, M. A., & Meissner, D. (2005). More Product, Less Process: Revamping Traditional Archival Processing. American Archivist, 68, 208–263.
***** Erway, R., & Schaffner, J. (2007). Shifting Gears: Gearing Up to Get Into the Flow.