Building a Digital Collection One Step at a Time

Michael Adamo, Noah Huffman and Richard Murray

A visitor exploring one of the Duke Libraries’ digital collections is probably too engrossed in the content to think very much about how the collection got there. In fact, each digital collection is the product of a cross-functional team of eight to ten staff members from several library departments. The team begins each new project with a workplan and proceeds through a series of steps that culminates in the collection’s public launch.

The project workplan includes a timeline as well as statements of resource and staff requirements. Sometimes the workplan also calls for an investigation of copyright or other intellectual property issues.

photo of conservation lab

With the workplan complete, materials identified for inclusion in the digital collection go to the Libraries’ conservation lab, where the staff assesses their physical condition and, if necessary, repairs or stabilizes them before they are scanned. The conservation treatment protects the materials from damage during digitization and preserves the physical items and their content to ensure their longevity after they’ve been scanned and returned to the stacks.

photo of digital production center

Following their stop in the conservation lab, materials are ready for digitization. Because the materials being digitized are often one-of-a-kind artifacts that may be in poor or delicate physical condition, it’s important that the electronic versions be high-quality, complete, and accurate representations of the physical objects. Most digitization is done at Perkins Library in the Digital Production Center, where, depending on the items’ physical characteristics, they may be scanned on a flatbed scanner or photographed with an overhead camera. Some materials, such as film or audio recordings, may be digitized elsewhere if the appropriate equipment or expertise is not available in-house. Before the Digital Production Center staff releases the digital images, they perform quality control, which includes everything from checking color accuracy to inspecting images for dust.

While it might seem that scanning and digitizing materials are the essence of what it takes to build a digital collection, this process is still just part of the initial phase of the project. The next step, assembling metadata, the information about the materials being digitized, is detailed and complex work that establishes the collection’s value to users. Without metadata, a 5000-item digital collection would be as difficult to use as 5000 photographs dumped on a tabletop.

Creating the metadata entails deciding what information to collect about the individual items in the collection, how to organize and describe each item, and what kind of terminology to use to lead people to the materials in the collection. Applying metadata can include adding captions to images, keywords to vintage advertisements, plot summaries to videos, and many other forms of description, as well as grouping similar objects into categories that users can browse. Archivists, catalogers, and other staff who provide metadata for digital collections employ the same skills they have always used to describe and arrange more traditional library materials.
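For readers who think in code, the idea can be sketched in a few lines of Python. This is only an illustration, not the system the Libraries actually use: the `ItemRecord` fields, the two helper functions, and the sample items are all hypothetical, chosen to mirror the captions, keywords, and browsable categories described above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ItemRecord:
    """Hypothetical metadata record for one digitized item."""
    identifier: str
    title: str
    description: str = ""  # a caption or plot summary
    keywords: list[str] = field(default_factory=list)
    category: str = "Uncategorized"


def group_by_category(records):
    """Group records into browsable categories, keyed by category name."""
    groups = defaultdict(list)
    for record in records:
        groups[record.category].append(record)
    return dict(groups)


def find_by_keyword(records, keyword):
    """Return records whose keywords include `keyword`, ignoring case."""
    wanted = keyword.casefold()
    return [r for r in records
            if wanted in (k.casefold() for k in r.keywords)]


# Invented sample items, for illustration only.
records = [
    ItemRecord("ad-0001", "Soap advertisement",
               keywords=["soap", "magazine"], category="Advertisements"),
    ItemRecord("ad-0002", "Automobile advertisement",
               keywords=["automobile", "magazine"], category="Advertisements"),
    ItemRecord("ph-0001", "Street scene",
               description="Pedestrians on a downtown street.",
               keywords=["street", "downtown"], category="Photographs"),
]

by_category = group_by_category(records)
soap_items = find_by_keyword(records, "Soap")
```

Even in this toy version, the records can be browsed by category or found by keyword, which is exactly what an undescribed pile of images cannot offer.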

While digitized items and metadata are crucial to building a successful digital collection, the collection’s user interface is an equally important element: What will users see when they view the collection on the Libraries’ website? How will they perceive and use the digital objects? To optimize the user’s experience, the production team works with librarians and other subject specialists on campus to create contextual information about each collection and present it in a way that will engage users. For a collection of photographs, for example, we may offer biographical information about the photographer, descriptions of the equipment and processes he or she used, and essays discussing the time period and cultural setting in which the photos were taken as well as the significance of the collection.

Once the team has digitized items, created metadata to describe them, and designed the user interface to display them, the collection is almost ready for publication; only two final steps remain. First, the technology staff brings together the data and files, which may exist in a variety of formats in many locations, to create the single database behind the collection that users will see. Then, the new digital collection is placed on a preproduction server, a staging area where library staff can view, test, and experiment with it, looking for any bugs, errors, or unfortunate surprises. Once the team decides that the collection is ready for the world, we move it to the production server and it “goes live.” We announce the new digital collection in many different ways, from official press releases to posts on blogs and social networking sites like Facebook and Twitter, and then watch with pride as users begin to discover and explore the digital collection we’ve built.

Mike, Noah, and Rich

L to R: Michael Adamo is Digital Production Developer; Noah Huffman, Archivist for Metadata and Encoding; and Richard Murray, Metadata Librarian.