Category Archives: Duke Digital Repository

Announcements, Collections, Digital Collections, Duke Digital Repository

Digital Collections 2019

December 15, 2019 Molly Bragg 2 Comments

’Tis the time of year for top 10 lists. Here at Duke Digital Collections HQ, we cannot just pick 10, because all our digital collections are tops! What follows is a list of all the digital collections we have launched for public access this calendar year.

Our newest collections include a range of formats and subject areas, from 19th-century manuscripts to African American soldiers’ photograph albums to Duke Men’s Basketball posters to our first multispectral images of papyrus to be ingested into the repository. We also added new content to 4 existing digital collections. Lastly, our platform migration is still ongoing, but we made some incredible progress this year, as you will see below. Our goal is to finish the migration by the end of 2020.

New Digital Collections

African American Soldiers Photo Albums (browse all 8 or 1 by 1 using the links below):
Image from the Women’s Army Corps and WAC African American Band scrapbook from Fort Des Moines
Contract with Freedmen on Plains Plantation
Duke Chapel Illuminated Photographs
Duke Men’s Basketball Posters
Duke University Payroll Ledgers, 1927-28 and 1930
Image of the Contract with Freedmen on Plains Plantation
Duke University Woman’s College Handbooks
Harmonic and keyboard designing book
Hayti Elizabeth Street renewal area
Ira Grady scrapbook
Le Pillage du Cap, révolte de Saint-Domingue, 1793
Lois Wright Richardson Davis
Mason Crum Papers
Memory Project
Papyri 1377, both in full spectrum and multispectral imaging
- See highlights of the images or access the full multispectral image stacks
Paul A. Samuelson Economics Cassette Series
Small manuscript collections from the Rubenstein Library, Boxes 20-35 or Charles Butler – James Beauchamp Clark papers
Sir Percy Molesworth Sykes Photograph Album
Woman: the world over

Additions to Existing Collections

Radio Haiti Archive collection landing page

Duke Basketball Films (31 new game films)
Duke Chapel Recordings (800 new recordings)
Duke Chronicle, 1939-40 issues
Radio Haiti Archive

Migrated Collections into the Duke Digital Repository

Kannapolis film from the H. Lee Waters film collection

Department of African and African American Studies records
H. Lee Waters film collection
Mary Dowdell Ashley film collection
Resource of Outdoor Advertising Descriptions
Sidney Gamble photographs

Behind the Scenes, Duke Digital Repository

A Statement of Commitment

November 11, 2019 Will Sexton 2 Comments

The featured image is from a mockup of a new repositories home page that we’re working on in the Libraries, planned for rollout in January of 2020.

Working at the Libraries, it can be dizzying to think about all of our commitments.

There’s what we owe our patrons, a body of so many distinct and overlapping communities, all seeking to learn and discover, that we could split the library along an infinite number of lines to meet them where they work and think.

There’s what we owe the future, in our efforts to preserve and share the artifacts of knowledge that we acquire on the market, that scholars create on our own campus, or that seem to form from history and find us somehow.

There’s what we owe the field, and the network of peer libraries that serve their own communities, each of them linked in a web of scholarship with our own. Within our professional network, we seek to support and complement one another, to compete sometimes in ways that move our field forward, and to share what we learn from our experiences.

The needs of information technology underlie nearly all of these activities, and to meet those needs, we have an IT staff that’s modest in size, but prodigious in its skill and its dedication to the mission of the Libraries. Within that group, the responsibility for creating new software, and maintaining what we have, falls to a small team of developers and devops engineers. We depend on them to enhance and support a wide range of platforms, including our web services, our discovery platforms, and our digital repositories.

This fall, we did some reflection on how we want to approach support for our repository platforms. The result of that reflection was a Statement of Commitment to Repositories Support and Development, a document of roughly a page that expresses what we consider to be our values in this area, and the context of priorities in which we do that work.

The committee that created the statement was our Digital Preservation and Publishing Program, or DP3 as we call it in house. We summarized our values as “openness, community and peer engagement, and independence from vended platforms,” which have “guided us to build our repositories on open source software platforms.” We place that work within the context of very large, looming priorities like our transition to FOLIO as our Library Services Platform, and the project to renovate Lilly Library. There are others, not mentioned in the statement, that fill the pages of this blog.

The statement is explicit that we will not seek to find alternative platforms for our repository services in the next several years, and in particular while the FOLIO transition is underway. This decision is informed by our recognition that migration of content and services across platforms is complex and expensive. It’s also a recognition that we have invested a lot into these existing platforms, and we want to carve out as much space as we can for our talented staff to focus on maintaining and improving them, rather than locking ourselves into all-consuming cycles of content migration.

From a practical perspective, and speaking as the manager who oversees software development in the Libraries, I see this statement as part of an overall strategy to bring focus to our work. It’s a small but important symbolic measure that recognizes the drag that we create for our software team when we give in to our urge to prioritize everything.

The phrase “context switching” is one that we have borrowed from the parlance of operating systems to describe the effects on a developer of working on multiple projects at once. There are real costs to moving between development environments, code bases, and architectures on the same day, in the same week, during the same sprint, or within even an extended work cycle. We also call this problem “multi-tasking,” and the penalty it imposes on performance is well documented.

Even more than performance, I think of it as a quality of life concern. People are generally happier and more invested when they’re able to do quality work. As a manager, I can work with scheduling and planning to try to mitigate those effects of multitasking on our team. But the responsibility really lies with the organization. We have our commitments, and they are vast in size and scope. We owe it to ourselves to do some introspection now and again, and ask what we can realistically do with what we have, or more accurately, who we are.

Behind the Scenes, Duke Digital Repository, Technology

What we talk about when we talk about digital preservation

September 22, 2019 Moira Downey

(Header image: Illustration by Jørgen Stamp digitalbevaring.dk CC BY 2.5 Denmark)

Here at Duke University Libraries, we often talk about digital preservation as though everyone is familiar with the various corners and implications of the phrase, but “digital preservation” is, in fact, a large and occasionally mystifying topic. What does it mean to “preserve” a digital resource for the long term? What does “the long term” even mean with regard to digital objects? How are libraries engaging in preserving our digital resources? And what are some of the best ways to ensure that your personal documents will be reusable in the future? While the answers to some of these questions are still emerging, the library can help you begin to think about good strategies for keeping your content available to other users over time by highlighting agreed-upon best practices, as well as some of the services we are able to provide to the Duke community.

File formats

Not all file formats have proven to be equally robust over time! Have you ever tried to open a document created using a Microsoft Office product from several years ago, only to be greeted with a page full of strangely encoded gibberish? Proprietary software like the products in the Office suite can be convenient and produce polished contemporary documents. But software changes, and there is often no guarantee that the beautifully formatted paper you’ve written using Word will be legible without the appropriate software 5 years down the line. One solution to this problem is to always have a version of that software available to you to use. Libraries are beginning to investigate this strategy (often using a technique called emulation) as an important piece of the digital preservation puzzle. The Emulation as a Service (EaaS) architecture is an emerging tool designed to simplify access to preserved digital assets by allowing end users to interact with the original environments running on different emulators.

An alternative to emulation as a solution is to save your files in a format that can be consumed by different, changing versions of software. Experts at cultural heritage institutions like the Library of Congress and the US National Archives and Records Administration have identified an array of file formats about which they feel some degree of confidence that the software of the future will be able to consume. Formats like plain text or PDFs for textual data, value separated files (like comma-separated values, or CSVs), MP3s and MP4s for audio and video data respectively, and JPEGs for still images have all proven to have some measure of durability as formats. What’s more, they will help to make your content or your data more easily accessible to folks who do not have access to particular kinds of software. It can be helpful to keep these format recommendations in mind when working with your own materials.

File format migration

The formats recommended by the Library of Congress and others have been selected not only because they are interoperable with a wide variety of software applications, but also because they have proven to be relatively stable over time, resisting format obsolescence. The process of moving data from an obsolete format to one that is usable in the present day is known as file format migration or format conversion. Libraries have yet to establish scalable strategies for extensive migration of obsolete file formats, though it is a subject of some concern.

Here at DUL, we encourage the use of one of these recommended formats for content that is submitted to us for preservation, and will even go so far as to convert your files prior to preservation in one of our repository platforms where possible and when appropriate to do so. This helps us ensure that your data will be usable in the future. What we can’t necessarily promise is that, should you give us content in a file format that isn’t one we recommend, a user who is interested in your materials will be able to read or otherwise use your files ten years from now. For some widely used formats, like MP3 and MP4, staff at the Libraries anticipate developing a strategy for migrating our data from these formats, in the event that they become superseded. However, the Libraries do not currently have the staff to monitor rarer, and especially proprietary, formats and convert them to ones that are immediately consumable by contemporary software. The best we can promise is that we are able to deliver to the end users of the future the same digital bits you initially gave to us.

Bit-level preservation

Which brings me to a final component of digital preservation: bit-level preservation. At DUL, we calculate a checksum for each of the files we ingest into any of our preservation repositories. Briefly, a checksum is an algorithmically derived alphanumeric hash that is intended to surface errors that may have been introduced to the file during its transmission or storage. A checksum acts somewhat like a digital fingerprint, and is periodically recalculated for each file in the repository environment by the repository software to ensure that nothing has disrupted the bits that compose each individual file. In the event that the recalculated checksum does not match the one recorded when the file was ingested into the repository, we can conclude with some level of certainty that something has gone wrong with the file, and it may be necessary to revert to an earlier version of the data. The process of generating, regenerating, and cross-checking these checksums is a way to ensure the file fixity, or file integrity, of the digital assets that DUL stewards.
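To make the idea of a fixity check a little more concrete, here is a minimal sketch in Python. It is not the code our repository software runs; the function names `file_checksum` and `fixity_ok` are hypothetical, and the choice of SHA-256 is an assumption made for illustration (the same pattern works with other hash algorithms).

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to limit memory use.

    SHA-256 is an assumed default for this sketch, not a statement about
    which algorithm any particular repository uses.
    """
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded: str, algorithm: str = "sha256") -> bool:
    """Recalculate the checksum and compare it with the one recorded at ingest."""
    return file_checksum(path, algorithm) == recorded.lower()
```

In practice you would record the result of `file_checksum` at ingest, store it alongside the file's metadata, and call `fixity_ok` on a schedule. A `False` result is the signal that the bits have changed and a known-good copy may be needed.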

Digital Collections, Duke Digital Repository

Resonance of a Moment

September 10, 2019 Shadae Gatlin

Resonance: the reinforcement or prolongation of sound by reflection from a surface or by the synchronous vibration of a neighboring object

(Lexico, 2019)

Nearly 4 months have passed since I moved to Durham from my hometown Chicago to join Duke’s Digital Collections & Curation Services team. With feelings of reflection and nostalgia, I have been thinking on the stories and memories that journeys create.

I have always believed a library the perfect place to discover another’s story. Libraries and digital collections are dynamic storytelling channels that connect people through narrative and memory. What are libraries if not places dedicated to memories? Memory made incarnate in the turn of a page, the capturing of an image.

Memory is sensation.

In my mind memory is ethereal – wispy and nebulous. Like trying to grasp mist or fog only to be left with the shimmer of dew on your hands. Until one focuses on a detail, then the vision sharpens. Such as the soothing warmth of a pet’s fur. A trace of familiar perfume in the air as a stranger walks by. Hearing the lilt of an accent from your hometown. That heavy, sticky feeling on a muggy summer day.

Memories are made of moments.

I do not recall the first time I visited a library. However, one day my parents took me to the library and I checked out 11 books on dinosaurs. As a child I was fascinated by them. Due to watching so much of The Land Before Time and Jurassic Park no doubt. One of the books had beautiful full-length pullout diagrams. I remember this.

Experiences tether individuals together across time and place. Place, like the telling of a story, is subjective. It holds a finite precision which is absent in the vagueness and vastness of space. This personal aspect is what captures a person when a tale is well told. A corresponding chord is struck, and the story resounds as listeners see themselves reflected.

When a narrative reaches someone with whom it resonates, its impact can be amplified beyond any expectations.

There are many unique memories and moments held in the Duke University Libraries digital collections. Come take a journey and explore a new story.

My humanity is bound up in yours, for we can only be human together. ~Desmond Tutu

Digital Collections, Duke Digital Repository, Projects

Celebrating a New Duke Digital Collections Milestone with Section A

July 12, 2019 Spencer Bevis

Duke Digital Collections recently passed 100,000 items!

Last week, it was brought to our attention that Duke Digital Collections recently passed 100,000 individual items found in the Duke Digital Repository! To celebrate, I want to highlight some of the most recent materials digitized and uploaded from our Section A project. In the past, Bitstreams has blogged about what Section A is and what it means, but it’s been a couple of years since that post, and a little refresher couldn’t hurt.

What is Section A?

In 2016, the staff of Rubenstein Research Services proposed a mass digitization project of Section A. This is the umbrella term for 175 boxes of different historic materials that users often request – manuscripts, correspondence, receipts, diaries, drawings, and more. These boxes contain around 3,900 small collections that all need their own workflows. Every box needs consultations from Rubenstein Research Services, review by Library Conservation Department staff, review by Technical Services, metadata updates, and more, all to make sure that the collections can be launched and hosted within the Duke Digital Repository.

In the 2 years since that blog post, so much has happened! The first 2 Section A collections had gone live as a sort of proof-of-concept, and as a way to define what the digitization project would be and what it would look like. We’ve added over 500 more collections from Section A since then. This somehow barely even scratches the surface of the entire project! We’re digitizing the collections in alphabetical order, and even after all the collections that have gone online, we are currently still only on the letter “C”!

Nonetheless, there are already plenty of materials to check out and enjoy. I was a student of history in college, so in this blog post, I want to particularly highlight some of the historic materials from the latter half of the 19th century.

Showing off some of Section A

Clara Barton’s description of the Grand Hotel de la Paix in Lyon, France.

In 1869, after her work as a nurse in the Civil War, Clara Barton traveled around Europe to Geneva, Switzerland and Corsica, France. Included in the Duke Digital Collections are her diary and calling cards from her time there. These pages detail where she visited and stayed throughout the year. She also wrote about her views on the different European countries, how Americans and Europeans compare, and more. Despite her storied career and her many travels that year, Miss Barton felt that “I have accomplished very little in a year”, and hoped that in 1870, she “may be accounted worthy once more to take my place among the workers of the world, either in my own country or in some other”.

Back in America, around 1900, the Rev. John Malachi Bowden began dictating and documenting his experiences as a Confederate soldier during the Civil War, one of many that a nurse like Miss Barton may have treated. Although Bowden says he was not necessarily a secessionist at the beginning of the Civil War, he joined the 2nd Georgia Regiment in August 1861 after Georgia had seceded. During his time in the regiment, he fought in the Battles of Fredericksburg, Gettysburg, Spotsylvania Court House, and more. In 1864, Union forces captured Bowden and held him as a prisoner at Maryland’s Point Lookout Prison, where he describes in great detail what life was like as a POW before his eventual release. He writes that he was “so indignant at being in a Federal prison” that he refused to cut his hair. His hair eventually grew to be shoulder-length, “somewhat like Buffalo Bill’s.”

Speaking of whom, Duke Digital Collections also has some material from Buffalo Bill (William Frederick Cody), courtesy of the Section A initiative. A showman and entertainer who performed in cowboy shows throughout the latter half of the 19th century, Buffalo Bill was enormously popular wherever he went. In this collection, he writes to a Brother Miner about how he invited seventy-five of his “old Brothers” from Bedford, VA to visit him in Roanoke. There is also a brief itinerary of future shows throughout North Carolina and South Carolina. This includes a stop here in Durham, NC a few weeks after Bill wrote this letter.

Buffalo Bill’s letter to his “Brother Miner”, dated October 17, 1916.

Around this time, Walter Clark, associate justice of the North Carolina Supreme Court, began writing his own histories of North Carolina throughout the 18th and 19th centuries. Three of Clark’s articles prepared for the University Magazine of the University of North Carolina have been digitized as part of Section A. This includes an article entitled “North Carolina in War”, where he made note of the Generals from North Carolina engaged in every war up to that point. It’s possible that John Malachi Bowden was once on the battlefield alongside some of these generals mentioned in Clark’s writings. This type of synergy in our collection is what makes Section A so exciting to dive into.

As the new Still Image Digitization Specialist at the Duke Digital Production Center, seeing projects like this take off in such a spectacular way is near and dear to my heart. Even just the four collections I’ve highlighted here have been so informative. We still have so many more Section A boxes to digitize and host online. It’s so exciting to think of what we might find and what we’ll digitize for all the world to see. Our work never stops, so remember to stay updated on Duke Digital Collections to see some of these newly digitized collections as they become available.

Digital Collections, Duke Digital Repository, Technology, User Experience

Web Accessibility: Values and Vigilance

May 10, 2019 Sean Aery

The Duke Libraries are committed to providing outstanding service based on respect and empathy for the diverse backgrounds and needs in our community. Our guiding principles make clear how critically important diversity and inclusion are to the library, and the extent to which we strive to break down barriers to scholarship.

One of the biggest and most important barriers for us to tackle is the accessibility of our web content. Duke University’s Web Accessibility site sums it up well:

Duke believes web content needs to be accessible to people with a wide range of abilities, including visual, auditory, physical, speech, cognitive, language, learning, and neurological abilities.

Screenshot of Duke Web Accessibility homepage — The Duke Web Accessibility website is a tremendous resource for the Duke community.

This belief is also consistent with the core values expressed by the American Library Association (ALA). A library’s website and online resources should be available in formats accessible to people of all ages and abilities.

Web Content

As one of the largest research libraries in the U.S., we have a whole lot of content on the web to consider.

Our website alone comprises over a thousand pages with more than fifty staff contributors. The library catalog interface displays records for over 13 million items at Duke and partner libraries. Our various digital repositories and digital exhibits platforms host hundreds of thousands of interactive digital objects of different types, including images, A/V, documents, datasets, and more. The list goes on.

Any attempt to take a full inventory of the library’s digital content reveals potentially several million web pages under the library’s purview, and all that content is managed and rendered via a dizzying array of technology platforms. We have upwards of a hundred web applications with public-facing interfaces. We built some of these ourselves, some are community-developed (with local customizations), and others we have licensed from vendors. Some interfaces are new, some are old. And some are really old, dating all the way back to the mid-90s.

Ensuring that this content is equally accessible to everyone is important, and it is indeed a significant undertaking. We must also be vigilant to ensure that it stays accessible over time.

With that as our context, I’d like to highlight a few recent efforts in the library to improve the accessibility of our digital resources.

Style Guide With Color Contrast Checks

In January 2019, we launched a new catalog, replacing a decade-old platform and its outdated interface. As we began developing the front-end, we knew we wanted to be consistent, constrained, and intentional in how we styled elements of the interface. We were especially focused on ensuring that any text in the UI had sufficient contrast with its background to be accessible to users with low vision or color-blindness.

We tried out a few existing “living style guide” frameworks, but none of them proved to be a good fit, particularly for color contrast management. So we ended up taking a DIY approach and developed our own living style guide using JavaScript and Ruby.

Screenshot of the library catalog style guide showing a color palette. — The library catalog’s living style guide dynamically checks for color contrast accessibility.

Here’s how it works. In our templates we specify the array of color variable names for each category. Then we use client-side JavaScript to dynamically measure the hex and RGB values and the luminance of each color in the guide. From those figures, we compute each color’s contrast ratio against black and against white, and return score labels for both, color-coded for WCAG 2.0 compliance.
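For readers curious about the arithmetic behind those score labels, here is a rough sketch of the WCAG 2.0 relative luminance and contrast ratio calculations. Our style guide does this in client-side JavaScript; the version below is written in Python purely for illustration, and the function names and label strings (`score_label`, `"AA Large"`, and so on) are hypothetical rather than taken from our code. The thresholds of 3:1, 4.5:1, and 7:1 come from the WCAG 2.0 contrast success criteria.

```python
def hex_to_rgb(hex_color: str) -> tuple:
    """Convert '#rrggbb' or '#rgb' to an (r, g, b) tuple of 0-255 integers."""
    value = hex_color.lstrip("#")
    if len(value) == 3:
        # Expand shorthand such as 'abc' to 'aabbcc'.
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6:
        raise ValueError(f"Not a hex color: {hex_color!r}")
    return (int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16))


def relative_luminance(hex_color: str) -> float:
    """WCAG 2.0 relative luminance: 0.0 for black, 1.0 for white."""
    def linearize(channel: int) -> float:
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(ch) for ch in hex_to_rgb(hex_color))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG 2.0 contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def score_label(ratio: float) -> str:
    """Map a ratio to a label; the label strings are illustrative only."""
    if ratio >= 7:
        return "AAA"
    if ratio >= 4.5:
        return "AA"
    if ratio >= 3:
        return "AA Large"
    return "Fail"


def score_palette_color(hex_color: str) -> dict:
    """Score one palette color against black text and white text."""
    return {
        "on_black": score_label(contrast_ratio(hex_color, "#000000")),
        "on_white": score_label(contrast_ratio(hex_color, "#ffffff")),
    }
```

Calling `score_palette_color` on each color in a palette gives the pair of labels a style guide could display next to each swatch. A mid-gray such as `#767676` sits just above the 4.5:1 threshold against white, which is why small shifts in a palette color can flip a label.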

This style guide is “living” in that it’s a real-time up-to-date reflection of how elements of the UI will appear when using particular color variable names and CSS classes. It helps to guide developers and other project team members to make good decisions about colors from our palette to stay in compliance with accessibility guidelines.

Audiovisual Captions & Interactive Transcripts

In fall 2017, I wrote about an innovative, custom-developed feature in our Digital Repository that renders interactive caption text for A/V within and below our media player. At that time, however, none of our A/V items making use of that feature were available to the public. In the months since then, we have debuted several captioned items for public access.

We extended these features in 2018, including: 1) exporting captions on-the-fly as Text, PDF, or original WebVTT files, and 2) accommodating transcript files that originated as documents (PDF, Word).

Screenshot of an interactive transcript with export options — WebVTT caption files for A/V are rendered as interactive HTML transcripts and can be exported into text or PDF.
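As a rough illustration of what a caption-to-text export involves, here is a minimal Python sketch that turns WebVTT cues into a plain-text transcript. It is not the code behind our repository feature; the function names are hypothetical, and the parser is deliberately simplified: it assumes blank-line-separated cue blocks, skips any block without a timing line (the header, NOTE, and STYLE blocks), and strips inline tags such as voice spans.

```python
import re

# Matches inline WebVTT tags such as <v Speaker>, <i>, or </i>.
TAG_PATTERN = re.compile(r"<[^>]+>")


def vtt_to_cues(vtt_text: str) -> list:
    """Parse WebVTT text into a list of (start_time, text) tuples."""
    cues = []
    normalized = vtt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # The timing line is the one containing '-->'; a cue identifier may precede it.
        timing_index = next(
            (i for i, line in enumerate(lines) if "-->" in line), None
        )
        if timing_index is None:
            continue  # Header, NOTE, or STYLE block: nothing to export.
        start = lines[timing_index].split("-->")[0].strip()
        text = " ".join(line.strip() for line in lines[timing_index + 1:])
        text = TAG_PATTERN.sub("", text).strip()
        if text:
            cues.append((start, text))
    return cues


def cues_to_transcript(cues: list) -> str:
    """Render cues as one timestamped line each."""
    return "\n".join(f"[{start}] {text}" for start, text in cues)
```

The same list of cues could feed an interactive HTML transcript (one clickable element per cue, keyed on the start time) or a PDF export; only the rendering step would change.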

Two of my talented colleagues have shared more about our A/V accessibility efforts at conferences over the past year. Noah Huffman presented at ARCHIVES*RECORDS (Joint Annual Meeting of CoSA, NAGARA, and SAA) in Aug 2018. And Molly Bragg presented at Digital Library Federation (DLF) Forum (slides) in Nov 2018.

Institutional Repository Accessibility

We have documented our work over 2018 revitalizing DSpace at Duke, and then subsequently developing a new set of innovative features that highlight Duke researchers and the impact of their work. This spring, we took a closer look at our new UI’s accessibility following Duke’s helpful guide.

In the course of this assessment, we were able to identify (and then fix!) several accessibility issues in DukeSpace. I’ll share two strategies in particular from the guide that proved to be really effective. I highly recommend using them frequently.

The Keyboard Test

How easy is it to navigate your site using only your keyboard? Can you get where you want to go using TAB, ENTER, SPACE, UP, and DOWN? Is it clear which element of the page currently has the focus?

Screenshot of DukeSpace homepage showing skip to content link — A “Skip to main content” feature in DukeSpace improves navigation via keyboard or assistive devices.

This test illuminated several problems. But with a few modest tweaks to our UI markup, we were able to add semantic markers to designate page sections and a skip to main content link, making the content much more navigable for users with keyboards and assistive devices alike.

A Browser Extension

If you’re a developer like me, chances are you already spend a lot of time using your browser’s Developer Tools pane to look under the hood of web pages, reverse-engineer UIs, mess with styles and markup, or troubleshoot problems.

The Deque Systems aXe Chrome Extension (also available for Firefox) integrates seamlessly into existing Dev Tools. It’s a remarkably useful tool to have in your toolset to help quickly find and fix accessibility issues. Its interface is clear and easy to understand. It finds and succinctly describes accessibility problems, and even tells you how to fix them in your code.

An image from the Deque aXe Chrome extension site showing the tool in action.

With aXe testing, we quickly learned we had some major issues to fix. The biggest problems revealed were missing form labels and page landmarks, and low contrast on color pairings. Again, these were not hard to fix since the tool explained what to do, and where.

Turning away from DSpace for a moment, see this example article published on a popular academic journal’s website. Note how it fares with an automated aXe accessibility test (197 violations of various types found). And if you were using a keyboard, you’d have to press Tab over 100 times in order to download a PDF of the article.

Screenshot of aXe Chrome extension running on a journal website. — UI for a published journal article in a publisher’s website after running the aXe accessibility test. Violations found: 197.

Now, let’s look at the open access copy of that same article that resides in our DukeSpace site. With our spring 2019 DukeSpace accessibility revisions in place, when we run an aXe test, we see zero accessibility violations. Our interface is also now easily navigated without a mouse.

Screenshot of DukeSpace UI showing no violations found by aXe accessibility checker — Open access copy of an article in DukeSpace: No accessibility violations found.

Here’s another example of an open access article in DukeSpace vs. its published counterpart in the website of a popular journal (PNAS). While the publisher’s site markup addresses many common accessibility issues, it still shows seven violations in aXe. And perhaps most concerning is that it’s completely unnavigable via a keyboard: the stylesheets have removed all focus styles from displaying.

Concluding Thoughts

Libraries are increasingly becoming champions for open access to scholarly research. The overlap in aims between the open access movement and web accessibility in general is quite striking. It all boils down to removing barriers and making access to information as inclusive as possible.

Our open access repository UIs may never be able to match all the feature-rich bells and whistles present in many academic journal websites. But accessibility, well, that’s right up our alley. We can and should do better. It’s all about being true to our values, collaborating with our community of peers, and being vigilant in prioritizing the work.

Look for many more accessibility improvements throughout many of the library’s digital resources as the year progresses.

Brief explanatory note about the A11Y++ image in this post: A11Y is a numeronym — shorthand for the word “accessibility” and conveniently also visually resembling the word “ally.” The “++” is an increment operator in many programming languages, adding one to a variable.

Behind the Scenes, Collaborations, Duke Digital Repository

It Takes a Village to Curate Your Data: Duke Partners with the Data Curation Network

March 1, 2019 Moira Downey

In early 2017, Duke University Libraries launched a research data curation program designed to help researchers on campus ensure that their data are adequately prepared for both sharing and publication, and long term preservation and re-use. Why the focus on research data? Data generated by scholars in the course of their investigation are increasingly being recognized as outputs similar in importance to the scholarly publications they support. Open data sharing reinforces unfettered intellectual inquiry, fosters transparency, reproducibility and broader analysis, and permits the creation of new data sets when data from multiple sources are combined. For these reasons, a growing number of publishers and funding agencies like PLoS ONE and the National Science Foundation are requiring researchers to make openly available the data underlying the results of their research.

But data sharing can only be successful if the data have been properly documented and described. And they are only useful in the long term if steps have been taken to mitigate the risks of file format obsolescence and bit rot. To address these concerns, Duke’s data curation workflow will review a researcher’s data for appropriate documentation (such as README files or codebooks), solicit and refine Dublin Core metadata about the dataset, and make sure files are named and arranged in a way that facilitates secondary use. Additionally, the curation team can make suggestions about preferred file formats for long-term re-use and conduct a brief review for personally identifiable information. Once the data package has been reviewed, the curation team can then help researchers make their data available in Duke’s own Research Data Repository, where the data can be licensed and assigned a Digital Object Identifier, ensuring persistent access.

“The Data Curation Network (DCN) serves as the ‘human layer’ in the data repository stack and seamlessly connects local data sets to expert data curators via a cross-institutional shared staffing model.”

New to Duke’s curation workflow is the ability to rely on the domain expertise of our colleagues at a few other research institutions. While our data curators here at Duke possess a wealth of knowledge about general research data-related best practices, and are especially well-versed in the vagaries of social sciences data, they may not always have all the information they need to sufficiently assess the state of a dataset from a researcher. As an answer to this problem, the Data Curation Network, an Alfred P. Sloan Foundation-funded endeavor, has established a cross-institutional staffing model that distributes the domain expertise of each of its partner institutions. Should a curator at one institution encounter data of a kind with which they are unfamiliar, submission to the DCN opens up the possibility for enhanced curation from a network partner with the requisite knowledge.

Duke joins Cornell University, Dryad Digital Repository, Johns Hopkins University, University of Illinois, University of Michigan, University of Minnesota, and Pennsylvania State University in partnering to provide curatorial expertise to the DCN. As of January of this year, the project has moved out of pilot phase into production, and is actively moving data through the network. If a Duke researcher submits a dataset that our curation team thinks would benefit from further examination by a curator with domain knowledge, we will now reach out to the potential depositor to receive clearance to submit the data to the network. We’re very excited about this opportunity to provide this enhancement to our service!

Looking forward, the DCN hopes to expand its offerings to include nationwide training on specialized data curation and to extend the curation services the network offers beyond the partner institutions to individual end users. Duke looks forward to contributing as the project grows and evolves.

Announcements, Behind the Scenes, Duke Digital Repository, New Collections, Projects

Digital Collections Round Up 2018

December 19, 2018 Molly Bragg

It’s that time of year when we like to reflect on all we have done in 2018, and your favorite digital collections and curation services team is no exception. This year, Digital Collections and Curation Services has been really focusing on getting collections and data into the Digital Repository and making them accessible to the world!

As you will see from the list below we launched 320 new digital collections, managed substantial additions to 2, and migrated 8. However, these publicly accessible digital collections are just the tip of the iceberg in terms of our work in Digital Collections and Curation Services.

A cover from the Ladyslipper, Inc Retail Catalogs digital collection.

So much more digitization happens behind the scenes than is reflected in the list of new digital collections. Many of our larger projects are years in the making. For example, we continue to digitize Gedney photographs and we hope to make them publicly accessible next year. There is also preservation digitization happening that we cannot make publicly accessible online. This work is essential to our preservation mission, though we cannot share the collections widely in the short term.

We strongly believe in keeping our metadata healthy, so in addition to managing new metadata, we often revisit existing metadata across our repositories in order to ensure its overall quality and functionality.

Our team is also responsible for ingesting not just digital collections, but research data and library collections as well. We preserved 20 datasets produced by the Duke scholarly community in the Research Data Repository (https://research.repository.duke.edu/) via the Research Data Curation program (https://library.duke.edu/data/data-management).

A selection from the Buffalo Bill papers, digitized as part of the Section A project.

New Digital Collections 2018

Darrin Zammit Lupi Photographs: https://repository.duke.edu/dc/lupidarrin
Duke Football Game Film Collection: https://repository.duke.edu/dc/uafootballfilms
Elizabeth Hatcher Conner negatives: https://repository.duke.edu/dc/uaconnereh
Emma Goldman papers: https://repository.duke.edu/dc/goldmanemma
Frank Clyde Brown recordings (coming by Dec. 31): https://repository.duke.edu/dc/brownfrankclyde
George Frederick Holmes letter book: https://repository.duke.edu/dc/holmesgeorge
Iconografica rappresentatione della inclita città di Venezia : consacrata al Reggio serenissimo dominio Veneto: https://repository.duke.edu/dc/maps/ughst001001
Joe H. Hernandez scrapbook: https://repository.duke.edu/dc/hernandezjoeh
Josephine Napoleon Leary papers: https://repository.duke.edu/dc/learyjosephine
Ladyslipper Inc, Retail Catalogs: https://repository.duke.edu/dc/ladyslipper
Reginald Sellman negatives: https://repository.duke.edu/dc/sellmanreginald
Section A (a term for thousands of small, mostly Southern, mostly 19th-century manuscript collections in the Rubenstein Library):
- Boxes 4-19: Amnesty oaths of Ex-Confederates collection – Anna Burnham papers, 310 collections total

Additions to Existing Digital Collections

Radio Haiti Archive: https://repository.duke.edu/dc/radiohaiti
Duke Chronicle – say hello to the 1990s! https://library.duke.edu/digitalcollections/dukechronicle/

The men’s basketball team celebrates its 1991 championship win.

Collections Migrated into the Digital Repository

American Song Sheets: https://repository.duke.edu/dc/songsheets
Asa and Elna Spaulding papers: https://repository.duke.edu/dc/spauldingasaelna
Barnard and Gardner Civil War Photographs: https://repository.duke.edu/dc/barnardgardner
Duke Basketball Video: https://repository.duke.edu/dc/mbball/dbfmv001001
Sam Reed and the Trumpet of Conscience: https://repository.duke.edu/dc/trumpet
SNCC 40th Anniversary Conference Recordings: https://repository.duke.edu/dc/snccanniversarytapes
Doris Duke Photographs: https://repository.duke.edu/dc/dorisdukephotos
Women’s Liberation Movement Print Culture: https://repository.duke.edu/dc/wlmpc

Digital Collections, Duke Digital Repository, Uncategorized

Metadata for Homiletics: Enhancing the Duke Chapel Recordings Digital Collection

September 28, 2018 Maggie Dickson

In 2016, after we launched the first iteration of the Duke Chapel Recordings Digital Collection in the Duke Digital Repository (DDR), we began a collaborative project between Digital Collections and Curation Services, University Archives, and the Duke Divinity School to enhance the metadata. The original metadata was fairly basic and allowed users to identify individual written, audio, and video sermons based on speaker, date, title, and format. All good stuff, but it didn’t allow for discovery based on the intellectual content of the sermons themselves. So, it was decided that, at the same time Divinity School staff listened to and corrected machine-generated transcripts for each sermon, they would also capture information that is useful from a homiletic perspective.

At the very beginning of the project, the Divinity School convened two focus groups of preachers from a variety of denominations and backgrounds to ask them how they would like to be able to discover and use a digital collection of sermons. These groups developed a set of terms and categories by which they would like to be able to identify sermons. From there I worked with the project team to begin thinking about what kinds of fields they would want to capture, and to determine whether or how those fields could map to the existing metadata application profile that we use in the DDR.

It quickly became clear that this project was going to require the creation of new metadata fields in the DDR application. I try to be really judicious about creating new fields (because otherwise, you end up doing this), but in this case, I felt that the need was justified: homiletic metadata is fairly specialized, and given Duke’s commitment to this collecting area, making adjustments to accommodate it seemed more than reasonable. Since I always like to work with best practices, I attempted to identify any extant metadata schemas for working with biblical metadata. I felt pretty confident that I would find one, considering that the Bible is actually one of the oldest books out there. While I did find some resources, they were pretty old (think last-updated-in-2006), and all of them were oriented towards marking up actual Biblical texts, rather than the encoding of metadata about those texts.

Screen capture of the OSIS (Open Scripture Information Standard) in the Wayback Machine.

With no established standards to work with, we set about determining what the fields should be, using the practice of homiletics itself as a guide. We also developed a workflow for capturing this metadata, using a Google spreadsheet with conditional formatting and pre-developed drop-down lists to control and facilitate data entry. And starting from the set of terms/categories developed during the focus groups, we came up with a normalized set of Library of Congress Subject Headings (LCSH) for staff to choose from or add to, as needs arose.
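To make the idea of controlled data entry concrete, here is a minimal Python sketch of the same kind of check a drop-down list enforces, applied to rows exported from a spreadsheet. The field names, the sample values in `CONTROLLED_VALUES`, and the function `invalid_cells` are hypothetical and chosen for illustration; they are not the project's actual fields or lists.

```python
# Hypothetical controlled lists; the project's real lists are not reproduced here.
CONTROLLED_VALUES = {
    "liturgical_calendar": {"Advent", "Lent", "Easter"},
    "biblical_book": {"Genesis", "Psalms", "Matthew"},
}


def invalid_cells(rows):
    """Return (row_number, field, value) for every value outside its controlled list."""
    problems = []
    for number, row in enumerate(rows, start=1):
        for field, allowed in CONTROLLED_VALUES.items():
            value = row.get(field, "").strip()
            # Empty cells are allowed; only non-empty, unlisted values are flagged.
            if value and value not in allowed:
                problems.append((number, field, value))
    return problems
```

Run against a list of dictionaries (for example, rows read with `csv.DictReader`), this reports typos and off-list terms so they can be corrected or, where justified, added to the list.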

Example of subject headings applied to a sermon, available at https://idn.duke.edu/ark:/87924/r41r6nf5b

Working with LCSH was in itself a challenge, as it required us to navigate the tension between the need to use a standardized set of headings and the need to include concepts that weren’t themselves well represented in the vocabulary. In some cases we diverged from LCSH in the interest of using terms that would be familiar, expected, and recognizable to practitioners of homiletics. One example of this is the term ‘Community’, which has a particular meaning in a Biblical context, but which would have lost its intent had we used the LCSH term ‘Communities’.

We rolled out the new metadata properties and values in early August so they could be available for use by attendees at the international homiletics conference, Societas Homiletica, which was held at Duke University August 3-8, 2018. Now, users of the digital collection can facet and browse by: Liturgical Calendar, Biblical Book, Chapter and Verse, and Subject. We’ve also added curated abstracts, and key quotations from the sermons, which are free-text searchable.
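For readers curious how a scripture reference might be broken into separate Book, Chapter, and Verse facet values, the sketch below shows one possible approach in Python. The citation format it accepts, the regular expression, and the function name `citation_facets` are assumptions for illustration; the post does not describe how the DDR actually stores or parses these values.

```python
import re

# Assumed citation shapes: "Book Chapter", "Book Chapter:Verse", "Book Chapter:Verse-Verse".
CITATION = re.compile(r"^\s*((?:[1-3]\s)?[A-Za-z ]+?)\s+(\d+)(?::(\d+)(?:-(\d+))?)?\s*$")


def citation_facets(text):
    """Split a citation string into facet values, or return None if it does not fit."""
    match = CITATION.match(text)
    if match is None:
        return None
    book, chapter, first, last = match.groups()
    return {
        "book": book,
        "chapter": int(chapter),
        "first_verse": int(first) if first else None,
        "last_verse": int(last) if last else None,
    }
```

Under these assumptions, `citation_facets("1 Corinthians 13:4-7")` yields the book "1 Corinthians", chapter 13, and verses 4 through 7, each of which could then feed its own facet.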

The enhanced metadata makes for a much more meaningful experience using the Duke Chapel Recordings, and future plans involve the inclusion of sermon transcripts, as well as the development of a complementary website, maintained by the Duke Divinity School, to provide even more information about the speakers and their sermons. With these enrichments, we are well on our way to having an unparalleled free and open resource for the study of homiletics, and hopefully, in so doing, we will facilitate the discovery and study of preachers whose voices have traditionally been underheard.

Duke Digital Repository

DDR-RD: Previewing DUL’s new platform for research data

July 20, 2018 Will Sexton

While we sometimes talk about “the repository” as if it were a monolith at Duke University Libraries, we have in fact developed and maintained two core platforms that function as repository applications. I’ll describe them briefly, then preview a third that is in development, as well as the rationale behind expanding in this way.
