Tag Archives: digital collections

2.5 Years in the Life of Digital Collections

Admit it, you have been wondering what your favorite digital collections team has been up to. Well, after 2.5 years, the wait is over.

So. Many. Digitization. Requests.

When I last shared a digital collections update, it was the end of 2020, and the digital collections team was focusing on managing and refining our folder-level patron request digitization workflow. This workflow has two main goals:

  • simplify the patron request process in the Rubenstein Library;
  • preserve and make accessible files from patron requests in the Duke Digital Repository (DDR).

Note that in this context, our patrons are generally folks who want to access Rubenstein Library materials without making the trip to Durham. Anyone, regardless of their researcher or academic status, can request digital copies of Rubenstein collections.

Moving digitization requests through this workflow continues to be the major focus for the digital collections team and the Digital Production Center (DPC). Given the folder-level nature of the process (whole folders of manuscript material at preservation quality), more requests are digitized by the DPC than under our previous workflow. Additionally, the new request process became an essential tool for serving remote researchers during the pandemic. It continues to be a valuable service, and we have not seen demand lessen significantly since the peak of the pandemic. Below is a chart showing the number of patron requests managed by the DPC since before the pandemic (note that we track our statistics by fiscal year or FY, which in Duke’s case is July – June).

                    FY18    FY19    FY20    FY21    FY22    FY23
# of requests         81      77      39     394     438     469
Files produced     1,323     676   1,092  79,519  74,517  73,705

Patron requests received and files produced from said requests by the DPC.

As a result of the new patron request workflow, the digital collections team has made portions of hundreds of collections accessible in the digital repository. We also see new materials from the existing collections requested periodically, so individual digital collections grow over time. Our statistics for new digital collections are in the chart below.

                                                                  FY21  FY22  FY23
New digital collections from patron requests                        36   154   119
Additions to existing collections from patron requests               4    12     6
Print items digitized for patron requests                           16    21    30
Non-patron based new digital collections                           131    15    80
Additions to digital collections (not patron request oriented)       4     4     5

Numbers of collections launched in the Duke Digital Repository since 2020.

The patron request workflow, like all other digital collections projects, is carried out by the cross-departmental Duke Libraries Digital Collections Implementation Team (DCIT). DCIT members include representatives from Conservation Services, Digital Curation Services, the Digital Production Center, a Digital Projects Developer (from the Assessment and User Experience Strategy department), Rubenstein Library Research Services, and Rubenstein Library Technical Services. The group’s membership shows the range of expertise needed to develop and sustain digital collections.

Not Just Patron Requests

Although the digital collections team shelved strategic projects when the pandemic began, we have still managed to complete some project work. One of our highest priorities in this area has been the Documenting African American Life in the Jim Crow South: Digital Access to the Behind the Veil Project Archive project (funded by the National Endowment for the Humanities). Work on this project was just featured on Bitstreams, so I won’t share too many details here. Stay tuned for more news about this incredible effort.

We have also been making slow progress on the “Section A” mass digitization project. This project is named for an old Rubenstein Library shelving location, and contains over 3000 small manuscript collections. Many of the collections document life in the South in the 19th century. Since 2020, we have been able to make 210 Section A collections accessible online. Many of these were scanned before the pandemic began; however, the DPC continues to scan Section A when time permits. We have also seen at least 25 Section A collections come all the way through the patron request workflow, and there are more in progress. I’ve included embedded links to 3 Section A collections below.

Here are a few other project highlights from the past 2.5 years.

Deed of manumission freeing Sue, an enslaved woman, and her daughter Margaret, Georgetown, South Carolina, 1815 October 6
Deed of Manumission from the American Slavery Digital collection
Migrants and refugees walk to a waiting bus after arriving on a rubber dinghy on a beach on the Greek island of Lesbos, January 29, 2016.
Image from the Darrin Zammit Lupi digital collection

Looking ahead

Digital Collections has a lot to look forward to in 2023-2024. Along with the John Hope Franklin Research Center, we expect to wrap up the Behind the Veil grant in 2024 (lots more news to come on that). The digital collections team also plans to continue refining the patron request workflow. We are hoping to find a new balance in our portfolio that allows us to continue serving the needs of remote researchers while also completing more project-based digitization. How will we actually do that without significantly changing our staffing? When we figure it out, we will be happy to share.

In the meantime, all digital collections are available through the Duke Digital Repository.
Happy Browsing!

 

Two Years In: The Finish Line Approaches for Digitizing Behind the Veil

Over the past year, Behind the Veil Digitization intern Sarah Waugh and Digital Collections intern Kristina Zapfe have focused on quality control of interviews transcribed by Rev.com. This post was authored by Sarah Waugh and Kristina Zapfe.

Introduction

The Digital Production Center (DPC) is proud to announce that we have reached a milestone in our work on Documenting African American Life in the Jim Crow South: Digital Access to the Behind the Veil Project Archive. We have completed digitization and are over halfway through our quality control of the audio transcripts! The project, funded by the National Endowment for the Humanities, will expand the Behind the Veil (BTV) digital collection, currently 410 audio files, to include the newly digitized copies of the original master recordings, photographic materials, and supplementary project files.

The collection derives from Behind the Veil: Documenting African-American Life in the Jim Crow South. This was an oral history project headed by Duke University’s Center for Documentary Studies from 1993 to 1995 and is currently housed in the David M. Rubenstein Rare Book and Manuscript Library and curated by the John Hope Franklin Research Center for African and African American History and Culture. The BTV collection documented and preserved the memory of African Americans who lived in the South from the 1890s to the 1950s, resulting in a culturally significant and extensive multimedia collection.

As interns, we focused on ordering transcripts from Rev.com and performing quality control on transcripts for the digitized oral histories. July 2023 marked our arrival at the halfway point of completing the oral history transcript quality control process. At the time of writing, we’ve checked 1,727 of 2,876 files after a year of initial planning and hard work. With over 1,666 hours’ worth of audio files to complete, 3 interns and 7 student workers in the DPC have contributed 849 combined hours to oral history transcript quality control so far. Because of their scope, transcription and quality control are the last pieces of the digitization puzzle before the collection moves on to be ingested and published in the Duke Digital Repository.

We are approaching the home stretch with the deadline for transcript quality control coming in December 2023, and the collection scheduled to launch in 2024. With that goal approaching, here is what we’ve completed and what remains to be done.

Digitization Progress

A graphic showing the statistics of the Behind the Veil project alongside a tablet device displaying the Duke Libraries Behind the Veil webpage. Under Audio, the text reads: 1,577 tapes, 2,876 audio files, and 2,709 files transcribed. Under Admin Files, it reads: 27,737 image files. Under Prints and Negatives, it reads: 1,294 image files. On the right side of the graphic, under Video, it reads: 14 tapes, 14 video files. Under Project Records, it reads: 9,328 image files. Under Photo Slides, it reads: 2,920 image files.

As the graphic above indicates, the BTV digitization project spans many different media, including audio, video, prints, negatives, slides, and administrative and project-related documents, that together tell a fuller story of this endeavor. With these formats digitized, we look forward to finishing quality control and preparing the files for handoff to members of the Digital Collections and Curation Services department for ingest, metadata application, and launch for public access in 2024. We plan to send all 2,876 audio files to Rev.com by the end of August and to perform quality control on all those transcripts by December 2023.

Developing the Transcription Quality Control Process

With 2,876 files to check within 19 months, the cross-departmental BTV team developed a process to perform quality control as efficiently as possible without sacrificing accuracy, accessibility, and our commitment to our stakeholders. We made our decisions based on how we thought BTV interviewers and narrators would want their speech represented as text. We started from Columbia University’s Oral History Transcription Style Guide and, from that resource, developed a workflow that made sense for our team and the project.

Some voices were difficult to transcribe due to issues with the original recording, such as a microphone being placed too far away from a speaker, the interference of background noise, or mistakes with the tape. Since we did not have the resources to listen to entire interviews and check for every single mistake, we developed what we called the “spot-check” process of checking these interviews. Given the BTV project’s original ethos and the history of marginalized people in archives, the team decided to prioritize making sure race-related language met our standards across every single interview.

A few decisions on standards were quick and unanimous, such as not transcribing speech phonetically. With that, we avoided pitfalls from older oral histories of African Americans, like the WPA’s famous “Slave Narratives” project, which interviewed formerly enslaved people but often transcribed their words in non-standard phonetic spellings. Some narrators in the BTV project who may have been familiar with the WPA transcripts specifically asked the BTV project team not to use phonetic spelling.

Other choices took more discussion: we agreed on capitalizing “Black” when describing race, but we had to decide whether to capitalize other racial terms, including “White” and antiquated designations like “Colored.” Ultimately, we decided to capitalize all racial terms (with the exception of slurs). The team did not want users to read meaning into a mix of lowercase and uppercase terms, which could happen if we capitalized only some of them. Maintaining consistency with capitalization would provide clarity and align with BTV values of equality between all races.

The spot-check process, in which we use Rev’s find-and-replace feature to standardize our top priorities, saves us time that we can spend improving the transcripts in other ways. For instance, we also try to find and correct proper nouns like street names or names of important people in our narrators’ communities, allowing users to make connections in their research. We correct mistakes with phrases used mainly in the past or that are very specific to certain regions, such as calling a dance hall a “Piccolo joint” from an early jukebox brand name. We also listen to instances where the transcriptionist could not hear or understand a phrase and marked it as “indistinct,” so we can add in the dialogue later (assuming we are able to decipher what was said).
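
For readers who want to try a similar spot-check outside of Rev’s editor, here is a minimal sketch in Python. It is not the tool we used: the function name, the term list, and the assumption that unclear passages contain the word “indistinct” are all illustrative. It only flags lines for a person to review, since a script cannot tell whether a word like “black” refers to race or to a color.

```python
import re

def flag_for_review(transcript: str, terms: list[str]) -> list[tuple[int, str, str]]:
    """Return (line_number, matched_text, reason) tuples for a human reviewer.

    `terms` holds the capitalized forms a style guide calls for (hypothetical
    input); the function looks for their lowercase forms and never edits text.
    """
    lowercase_terms = re.compile(
        r"\b(" + "|".join(re.escape(t.lower()) for t in terms) + r")\b"
    )
    # Assumption: the vendor marks unclear audio with the word "indistinct".
    indistinct = re.compile(r"indistinct", re.IGNORECASE)

    hits = []
    for number, line in enumerate(transcript.splitlines(), start=1):
        for match in lowercase_terms.finditer(line):
            hits.append((number, match.group(0), "lowercase term"))
        for match in indistinct.finditer(line):
            hits.append((number, match.group(0), "indistinct marker"))
    return hits
```

Each hit still needs a listener’s judgment; the sketch just shortens the search.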

While we developed these methods to increase the pace of our quality control process, one of the biggest improvements came from working with Rev. If we were able to attain more accurate transcripts, our quality control process would be more efficient. Luckily, Rev’s suite of services provided us this option without straying too far from our transcription budget.

Improving Accuracy with Southern Accents Specialists

When deciding on what would be the best speech-to-text option for our project’s needs, we elected to order Transcript Services from Rev, rather than their Caption Services. This decision hinged on the fact that the Transcript Services option is their only service that allows us to request Rev transcriptionists who specialize in Southern accents. Many people who were interviewed for Behind the Veil spoke with Southern accents that varied in strength and dialect. We found that the Southern accent expertise of the specialists had a significant impact on the accuracy of the transcripts we received from Rev. 

This improvement in transcript quality has made a substantial difference in the time we spend on quality control for each interview: on average, it only takes us about 48 seconds of work for every 60 seconds of audio we check. We appreciated Rev’s offering of Southern accent specialists enough that we chose that service, even though it meant that we had to then convert their text file format output to the WebVTT file format for enhanced accessibility in the Duke Digital Repository.   
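
To put the 48-seconds-per-minute figure in context, here is a rough back-of-the-envelope calculation in Python. It uses only numbers quoted in this post; the projected total is our own rough inference, assuming the average pace holds across all of the audio, and “over 1,666 hours” is treated as a lower bound.

```python
# Figures quoted in this post.
AUDIO_HOURS = 1666                  # "over 1,666 hours" of audio (lower bound)
FILES_TOTAL = 2876
FILES_CHECKED = 1727
WORK_SECONDS_PER_AUDIO_MINUTE = 48  # average spot-check pace

def qc_estimates():
    """Return (QC hours per audio hour, projected total QC hours, share of files checked)."""
    rate = WORK_SECONDS_PER_AUDIO_MINUTE / 60
    total_qc_hours = AUDIO_HOURS * rate   # assumes the average pace applies to every file
    share_checked = FILES_CHECKED / FILES_TOTAL
    return rate, total_qc_hours, share_checked

rate, total_qc_hours, share_checked = qc_estimates()
print(f"{rate:.2f} hours of quality control per hour of audio")
print(f"about {total_qc_hours:,.0f} hours of quality control at that pace")
print(f"{share_checked:.1%} of files checked so far")
```

Because interviews vary in length, the share of files checked is not the same as the share of audio hours checked, so these numbers are only a rough guide.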

Optimizing Accessibility with WebVTT File Format

The WebVTT file format attaches timestamps to the transcript text, which lets the repository’s player track the written transcript in step with the audio. This improvement in user experience and accessibility justified converting the interview transcripts to WebVTT format. Below is a visual of the WebVTT format in our existing BTV collection in the DDR. Click here to listen to the audio recording.

We have been collaborating with developer Sean Aery to convert transcript text files to WebVTT files so they will display properly in the Duke Digital Repository. He explained the conversion process that occurs after we hand off the transcripts in text file format.

“The .txt transcripts we received from the vendor are primarily formatted to be easy for people to read. However, they are structured well enough to be machine-readable as well. I created a script to batch-convert the files into standard WebVTT captions with long text cues. In WebVTT form, the caption files play nicely with our existing audiovisual features in the Duke Digital Repository, including an interactive transcript viewer, and PDF exports.”
Sean Aery, Digital Projects Developer, Duke University Libraries
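
To give a sense of what such a conversion involves, here is a minimal sketch in Python. It is not Sean’s script. It assumes a hypothetical transcript layout in which each speaker turn starts with a line like `Narrator (00:05):` followed by the spoken text, and it writes one long cue per turn, ending each cue where the next turn begins. The function names and the `total_seconds` parameter are our own.

```python
import re

# Assumed turn header (hypothetical layout): "Speaker Name (MM:SS):" or "(HH:MM:SS):"
HEADER = re.compile(r"^(?P<speaker>.+?)\s*\((?P<ts>(?:\d+:)?\d{1,2}:\d{2})\):?\s*$")

def to_seconds(stamp: str) -> int:
    """Convert 'MM:SS' or 'HH:MM:SS' to whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def vtt_time(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"

def escape(text: str) -> str:
    """Escape characters that have special meaning inside WebVTT cue text."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

def transcript_to_vtt(text: str, total_seconds: float) -> str:
    """Build a WebVTT document with one cue per speaker turn."""
    turns = []  # each entry: [start_seconds, speaker, list_of_text_lines]
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            turns.append([to_seconds(header["ts"]), header["speaker"], []])
        elif line and turns:
            turns[-1][2].append(line)

    out = ["WEBVTT", ""]
    for i, (start, speaker, lines) in enumerate(turns):
        end = turns[i + 1][0] if i + 1 < len(turns) else total_seconds
        end = max(end, start + 1)  # a cue's end time must come after its start
        out.append(f"{vtt_time(start)} --> {vtt_time(end)}")
        out.append(f"<v {escape(speaker)}>{escape(' '.join(lines))}")
        out.append("")
    return "\n".join(out)
```

Tagging each cue with a `<v Speaker>` voice span keeps the speaker label attached to the text, which is one reasonable way to preserve who said what; the production script may handle speakers differently.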

Before conversion, we complete one more round of quality control using the spot-checking process. We have even referred to other components of the Behind the Veil collection (the Administrative and Project Files) to cross-reference any alterations to metadata for accuracy.

Connecting the Local and Larger Community

Throughout the project, team members have been working on outreach. One big accomplishment by project PI John Gartrell and former BTV outreach intern Brianna McGruder was “Behind the Veil at 30: Reflections on Chronicling African American Life in the Jim Crow South.” This 2-day virtual conference convened former BTV interviewers and current scholars of the BTV collection to discuss their work and the impact that this collection had on their research.

We also recently presented at the Triangle Research Libraries Network annual meeting, where our presentation overlapped with some of what you’ve just read in this post. It was exciting to share our work publicly for the first time and answer questions from library staff across the region. We will also be presenting a poster about our BTV experience at the upcoming North Carolina Library Association conference in Winston-Salem in October.

An image of two people standing at a podium with a screen behind them. Four people in the front row look out at them.
Sarah Waugh and Kristina Zapfe presenting at the 2023 TRLN Annual Conference.

As we’ve hoped to convey, this project heavily relies on collaboration from many library departments and external vendors, and there are more contributors than we can thoroughly include in this post. Behind the Veil is a large-scale and high-profile project that has impacted many people over its 30-year history, and this newest iteration of digital accessibility seeks to expand the reach of this collection. Two years on, we’ve built on the work of the many professionals who have come before us to create and develop Behind the Veil. We are honored to be part of this rewarding process. Look for more BTV stories when we cross the finish line in 2024. 

Casting a Critical Eye on the Hayti-Elizabeth Street Renewal Area Maps

In 2019, one of the digital collections we made available to the public was a small set of architectural maps and plans titled the ‘Hayti-Elizabeth Street Renewal Area’. The short description of the maps indicates they ‘depict existing and proposed structures and modifications to the Hayti neighborhood in Durham, NC.’ Sounds pretty benign, right? Perhaps even kind of hopeful, given the word ‘renewal’?

Hayti-Elizabeth Street Renewal Area, Existing Land Use Map

Nope. This anodyne description does not tell the story of the harm caused by the Durham Urban Renewal project of the 1960s and 1970s. The Durham Redevelopment Commission intended to eliminate ‘urban blight’ via this project, which ultimately resulted in the destruction of more than 4,000 households and 500 businesses in predominantly African American areas of the city. The Hayti District, once a flourishing and self-sufficient neighborhood filled with Black-owned businesses, was largely demolished, divided, and effectively severed from what is now downtown Durham by the construction of NC Highway 147. 

Bull City 150, a “public history, geography and community engagement project” based here at Duke University, hosts a suite of excellent multi-media public history exhibitions about housing inequality in Durham on its website. One of these is Dismantling Hayti, which focuses in particular on the effects of urban renewal on the neighborhood and the city.

Dismantling Hayti, Bull City 150

But this story of so-called urban renewal is not just about Durham – it’s about the United States as a whole. From the 1950s to the 1980s, municipalities across the country demolished roughly 7.5 million dwelling units, with a vastly disproportionate impact on Black and low-income neighborhoods, in the name of revitalization. Bulldozing for highway corridors was frequently a part of urban renewal projects, happening in San Francisco, Memphis, Boston, Atlanta, Syracuse, Baltimore, everywhere in the country – the list goes on and on. And it includes Saint Paul, Minnesota, the city where, mourning and protesting the killing of yet another Black person at the hands of a white police officer, thousands of people occupied Interstate 94 in recent weeks, marching from the state capitol to Minneapolis, over a highway that was once the African American neighborhood of Rondo.  

Urban renewal projects led to what social psychiatrist Dr. Mindy Fullilove refers to as root shock – “a traumatic stress reaction related to the destruction of one’s emotional ecosystem”. This is but one thread in the fabric of white supremacy out of which our country was woven, among other twentieth century practices of redlining, discriminatory mortgage lending practices, denial of access to unemployment benefits, and rampant Jim Crow laws, which are still causing harm today. This is why it is important to interrogate the historical context of resources like the Hayti-Elizabeth Street Renewal Area maps – we should all accept the invitation extended on the Bull City 150 website to Durhamites to “reckon with the racial and economic injustices of the past 150 years and commit to building a more equitable future”.

Beyond One Thousand Words

There is a particular fondness that I hold for digital photograph collections. If I had to pinpoint when this began, then I would have to say it started while digitizing material on a simple Epson flatbed scanner as an undergraduate student worker in the archives.

Witnessing the physical become digital is a wonder that never gets old.

Every day we are generating digital content. Pet pics. Food pics. Selfies. Gradually building a collection of experiences as we document our lives in images. Sporadic born digital collections stored on devices and in the cloud.

I do not remember the last time I printed a photograph.

My parents have photo albums that I love. Seeing images of them, then us. The tacky adhesive and the crinkle of thin plastic film as it is pulled back to lift out a photo. That perfect square imprint left behind from where the photo rested on the page.

Pretty sure that Polaroid camera is still around somewhere.

Time bound up in a book.

Beyond their visual appeal, I appreciate how photos capture time. Nine months have passed since I moved to North Carolina. I started 2019 in Chicago and ended it in Durham. These photos of my winter in both places illustrate that change well.

Sometimes I want to pull down my photos from the cloud and just print everything. Make my own album. Have something with heft and weight to share and say, “Hey, hold and look at this.” That sensory experience is invaluable.

Yet, I also value the convenience of being able to view hundreds of photos with the touch of a button.

Duke University Libraries offers access to thousands of images through its Digital Collections.

Here are a couple of photo collections to get you started:

Resonance of a Moment

Resonance: the reinforcement or prolongation of sound by reflection from a surface or by the synchronous vibration of a neighboring object

(Lexico, 2019)

Nearly 4 months have passed since I moved to Durham from my hometown Chicago to join Duke’s Digital Collections & Curation Services team. With feelings of reflection and nostalgia, I have been thinking on the stories and memories that journeys create.

I have always believed a library the perfect place to discover another’s story. Libraries and digital collections are dynamic storytelling channels that connect people through narrative and memory. What are libraries if not places dedicated to memories? Memory made incarnate in the turn of a page, the capturing of an image.

Memory is sensation.

In my mind memory is ethereal – wispy and nebulous. Like trying to grasp mist or fog only to be left with the shimmer of dew on your hands. Until one focuses on a detail, then the vision sharpens. Such as the soothing warmth of a pet’s fur. A trace of familiar perfume in the air as a stranger walks by. Hearing the lilt of an accent from your hometown. That heavy, sticky feeling on a muggy summer day.

Memories are made of moments.

I do not recall the first time I visited a library. However, one day my parents took me to the library and I checked out 11 books on dinosaurs. As a child I was fascinated by them. Due to watching so much of The Land Before Time and Jurassic Park no doubt. One of the books had beautiful full-length pullout diagrams. I remember this.

Experiences tether individuals together across time and place. Place, like the telling of a story, is subjective. It holds a finite precision which is absent in the vagueness and vastness of space. This personal aspect is what captures a person when a tale is well told. A corresponding chord is struck, and the story resounds as listeners see themselves reflected.

When a narrative reaches someone with whom it resonates, its impact can be amplified beyond any expectations.

There are many unique memories and moments held in the Duke University Libraries digital collections. Come take a journey and explore a new story.

My humanity is bound up in yours, for we can only be human together. ~Desmond Tutu

Celebrating a New Duke Digital Collections Milestone with Section A

Duke Digital Collections recently passed 100,000 items!

 

Last week, it was brought to our attention that Duke Digital Collections recently passed 100,000 individual items found in the Duke Digital Repository! To celebrate, I want to highlight some of the most recent materials digitized and uploaded from our Section A project. In the past, Bitstreams has blogged about what Section A is and what it means, but it’s been a couple of years since that post, and a little refresher couldn’t hurt.

What is Section A?

In 2016, the staff of Rubenstein Research Services proposed a mass digitization project of Section A. This is the umbrella term for 175 boxes of different historic materials that users often request – manuscripts, correspondence, receipts, diaries, drawings, and more. These boxes contain around 3,900 small collections that all had their own workflows. Every box needed consultations from Rubenstein Research Services, review by Library Conservation Department staff, review by Technical Services, metadata updates, and more, all to make sure that the collections could be launched and hosted within the Duke Digital Repository.

In the 2 years since that blog post, so much has happened! The first 2 Section A collections had gone live as a sort of proof-of-concept, and as a way to define what the digitization project would be and what it would look like. We’ve added over 500 more collections from Section A since then. This somehow barely even scratches the surface of the entire project! We’re digitizing the collections in alphabetical order, and even after all the collections that have gone online, we are currently still only on the letter “C”! 

Nonetheless, there are already plenty of materials to check out and enjoy. I was a student of history in college, so in this blog post, I want to particularly highlight some of the historic materials from the latter half of the 19th century.

Showing off some of Section A

Clara Barton’s description of the Grand Hotel de la Paix in Lyon, France.

In 1869, after her work as a nurse in the Civil War, Clara Barton traveled around Europe to Geneva, Switzerland and Corsica, France. Included in the Duke Digital Collections are her diary and calling cards from her time there. These pages detail where she visited and stayed throughout the year. She also wrote about her views on the different European countries, how Americans and Europeans compare, and more. Despite her storied career and her many travels that year, Miss Barton felt that “I have accomplished very little in a year”, and hoped that in 1870, she “may be accounted worthy once more to take my place among the workers of the world, either in my own country or in some other”.

Back in America, around 1900, the Rev. John Malachi Bowden began dictating and documenting his experiences as a Confederate soldier during the Civil War, one of many that a nurse like Miss Barton may have treated. Although Bowden says he was not necessarily a secessionist at the beginning of the Civil War, he joined the 2nd Georgia Regiment in August 1861 after Georgia had seceded. During his time in the regiment, he fought in the Battles of Fredericksburg, Gettysburg, Spotsylvania Court House, and more. In 1864, Union forces captured and held Bowden as a prisoner at Maryland’s Point Lookout Prison, where he describes in great detail what life was like as a POW before his eventual release. He writes that he was “so indignant at being in a Federal prison” that he refused to cut his hair. His hair eventually grew to be shoulder-length, “somewhat like Buffalo Bill’s.”

Speaking of whom, Duke Digital Collections also has some material from Buffalo Bill (William Frederick Cody), courtesy of the Section A initiative. A showman and entertainer who performed in cowboy shows throughout the latter half of the 19th century, Buffalo Bill was enormously popular wherever he went. In this collection, he writes to a Brother Miner about how he invited seventy-five of his “old Brothers” from Bedford, VA to visit him in Roanoke. There is also a brief itinerary of future shows throughout North Carolina and South Carolina. This includes a stop here in Durham, NC a few weeks after Bill wrote this letter.

Buffalo Bill’s letter to his “Brother Miner”, dated October 17, 1916.

Around this time, Walter Clark, associate justice of the North Carolina Supreme Court, began writing his own histories of North Carolina throughout the 18th and 19th centuries. Three of Clark’s articles prepared for the University Magazine of the University of North Carolina have been digitized as part of Section A. This includes an article entitled “North Carolina in War”, where he made note of the Generals from North Carolina engaged in every war up to that point. It’s possible that John Malachi Bowden was once on the battlefield alongside some of these generals mentioned in Clark’s writings. This type of synergy in our collection is what makes Section A so exciting to dive into.

As the new Still Image Digitization Specialist at the Duke Digital Production Center, seeing projects like this take off in such a spectacular way is near and dear to my heart. Even just the four collections I’ve highlighted here have been so informative. We still have so many more Section A boxes to digitize and host online. It’s so exciting to think of what we might find and what we’ll digitize for all the world to see. Our work never stops, so remember to stay updated on Duke Digital Collections to see some of these newly digitized collections as they become available. 

The ABCs of Digitizing Section A

I’m not sure anyone who currently works in the library has any idea when the phrase “Section A” was first coined as a call number for small manuscript collections. Before the library’s renovation, before we barcoded all our books and boxes — back when the Rubenstein was still RBMSCL, and our reading room carpet was a very bright blue — there was a range of boxes holding single-folder manuscript collections, arranged alphabetically by collection creator. And this range was called Section A.

Box 175 of Section A

Presumably there used to be a Section B, Section C, and so on — and it could be that the old shelf ranges were tracked this way, I’m not sure — but the only one that has persisted through all our subsequent stacks moves and barcoding projects has been Section A. Today there are about 3900 small collections held in 175 boxes that make up the Section A call number. We continue to add new single-folder collections to this call number, although thanks to the miracle of barcodes in the catalog, we no longer have to shift files to keep things in perfect alphabetical order. The collections themselves have no relationship to one another except that they are all small. Each collection has a distinct provenance, and the range of topics and time periods is enormous — we have everything from the 17th to the 21st century filed in Section A boxes. Small manuscript collections can also contain a variety of formats: correspondence, writings, receipts, diaries or other volumes, accounts, some photographs, drawings, printed ephemera, and so on. The bang-for-your-buck ratio is pretty high in Section A: though small, the collections tend to be well-described, meaning that there are regular reproduction and reference requests. Section A is used so often that in 2016, Rubenstein Research Services staff approached Digital Collections to propose a mass digitization project, re-purposing the existing catalog description into digital collections within our repository. This will allow remote researchers to browse all the collections easily, and also reduce repetitive reproduction requests.

This project has been met with enthusiasm and trepidation from staff since last summer, when we began to develop a cross-departmental plan to appraise, enhance description, and digitize the 3900 small manuscript collections that are housed in Section A. It took us a bit of time, partially due to the migration and other pressing IT priorities, but this month we are celebrating a major milestone: we have finally launched our first 2 Section A collections, meant to serve as a proof of concept, as well as a chance for us to firmly define the project’s goals and scope. Check them out: Abolitionist Speech, approximately 1850, and the A. Brouseau and Co. Records, 1864-1866. (Appropriately, we started by digitizing the collections that began with the letter A.)

A. Brouseau & Co. Records carpet receipts, 1865

Why has it been so complicated? First, the sheer number of collections is daunting; while there are plenty of digital collections with huge item counts already in the repository, they tend to come from a single or a few archival collections. Each newly-digitized Section A collection will be a new collection in the repository, which has significant workflow repercussions for the Digital Collections team. There is no unifying thread for Section A collections, so we are not able to apply metadata in batch like we would normally do for outdoor advertising or women’s diaries. Rubenstein Research Services and Library Conservation Department staff have been going box by box through the collections (there are about 25 collections per box) to identify out-of-scope collections (typically reference material, not primary sources), preservation concerns, and copyright concerns. These are excluded from the digitization process. Technical Services staff are also reviewing and editing the Section A collections’ description. This project has led to our enhancing some of our oldest catalog records — updating titles, adding subject or name access, and upgrading the records to RDA, a relatively new standard. Using scripts and batch processes (details on GitHub), the refreshed MARC records are converted to EAD files for each collection, and the digitized folder is linked through ArchivesSpace, our collection management system. We crosswalk the catalog’s name and subject access data to both the finding aid and the repository’s metadata fields, allowing the collection to be discoverable through the Rubenstein finding aid portal, the Duke Libraries catalog, and the Duke Digital Repository.
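
As a rough illustration of what a name and subject crosswalk can look like, here is a minimal Python sketch. It is not the project’s actual code (see the scripts referenced above for that). It assumes the access points have already been pulled out of a MARC record as simple (tag, heading) pairs, maps a small subset of MARC 6xx and 7xx tags to EAD access-point elements, and wraps them in a `controlaccess` element. The function name and the sample headings in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of a MARC-to-EAD crosswalk for access points.
MARC_TO_EAD = {
    "600": "persname",  # subject added entry, personal name
    "610": "corpname",  # subject added entry, corporate name
    "650": "subject",   # subject added entry, topical term
    "651": "geogname",  # subject added entry, geographic name
    "700": "persname",  # added entry, personal name
    "710": "corpname",  # added entry, corporate name
}

def controlaccess_from_marc(fields):
    """Build an EAD <controlaccess> element from (tag, heading) pairs.

    Tags outside the mapping are skipped, and repeated headings are
    written only once per element type.
    """
    root = ET.Element("controlaccess")
    seen = set()
    for tag, heading in fields:
        element = MARC_TO_EAD.get(tag)
        if element is None or (element, heading) in seen:
            continue
        seen.add((element, heading))
        ET.SubElement(root, element).text = heading
    return root
```

For example, calling `controlaccess_from_marc([("650", "Antislavery movements"), ("700", "Doe, Jane")])` yields a `controlaccess` element containing one `subject` and one `persname` child, which could then be placed in a finding aid or mapped again to repository metadata fields.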

It has been really exciting to see the first two collections go live, and there are many more already digitized and just waiting in the wings for us to automate some of our linking and publishing processes. Another future development that we expect will speed up the project is a batch ingest feature for collections entering the repository. With over 3000 collections to ingest, we are eager to streamline our processes and make things as efficient as possible. Stay tuned here for more updates on the Section A project, and keep an eye on Digital Collections if you’d like to explore some of these newly-digitized collections.