Category Archives: Projects

New Duke Libraries catalog to go live January 16

The wait is almost over! After more than two years of hard work by library staff at Duke, NCCU, NCSU, and UNC, we’ll be launching a new catalog that researchers can use to locate and access books, DVDs, music, archival materials, and other items held here at Duke and across the other Triangle Research Libraries member libraries (NCSU, NCCU, UNC). The new collaboratively developed, open-source library catalog will replace the decade-old Duke Libraries catalog and Search TRLN catalog.

What to expect from the new catalog

While the basic functionality of the catalog remains the same – researchers will be able to search across the 15 million+ items at Duke and area libraries – we think you’ll enjoy some nice enhancements and updates to the catalog:  

  • a more modern search interface
  • prominent filters to modify search results, including an option to narrow your search to items available online
  • a more intuitive button to expand search results to include items held at UNC, NCCU, and NCSU
  • updated item pages with easier access to request and export materials
  • improved access to an item’s availability status and its location  
  • more user-friendly options to email, text, or export citation information
  • improved display of multi-part items (click “View Online” to access individual episodes)
  • more robust Advanced Search with options to search by Publisher and Resource type (e.g., book, video, archival material)
Screenshot of full item page.
Item pages have been updated and include a sidebar with easy access to Request and Email/Text options.

You might notice some differences in the way the new catalog works.  Learn more with this handy new vs. old catalog comparison chart.  (Note that we plan to implement some of the features that are not currently available in the new catalog this spring – stay tuned for more info, and let us know if there are aspects of the old catalog that you miss.)  And if you run into trouble or have more questions about using the new catalog, check out these library catalog search tips, or contact a librarian for assistance.   

Screenshot of Search Tips webpage
Get tips for using the new catalog.

We welcome your feedback

While the new catalog is fully functional and boasts a number of enhancements, we know there is still work to be done, and we welcome your input.  We invite you to explore the new catalog at https://find.library.duke.edu/ and report problems or provide general feedback through this online form.  We’ll continue to conduct user testing this spring and make improvements based on what we learn – look for the “Free Coffee” sign in the Perkins Library lobby, and stop by to tell us what you think.  

Want more info about this project?  Learn about the vision for developing the new catalog and the work that’s been completed to date.

Digital Collections Round Up 2018

It’s that time of year when we like to reflect on all we have done in 2018, and your favorite digital collections and curation services team is no exception. This year, Digital Collections and Curation Services has been focused on getting collections and data into the Digital Repository and making them accessible to the world!

As you will see from the list below, we launched 320 new digital collections, managed substantial additions to 2, and migrated 8. However, these publicly accessible digital collections are just the tip of the iceberg in terms of our work in Digital Collections and Curation Services.

A cover from the Ladyslipper, Inc Retail Catalogs digital collection.

So much more digitization happens behind the scenes than is reflected in the list of new digital collections.  Many of our larger projects are years in the making. For example, we continue to digitize Gedney photographs and we hope to make them publicly accessible next year.  There is also preservation digitization happening that we cannot make publicly accessible online.  This work is essential to our preservation mission, though we cannot share the collections widely in the short term.

We strongly believe in keeping our metadata healthy, so in addition to managing new metadata, we often revisit existing metadata across our repositories in order to ensure its overall quality and functionality.

Our team is also responsible for ingesting not just digital collections, but research data and library collections as well. We preserved 20 datasets produced by the Duke scholarly community in the Research Data Repository (https://research.repository.duke.edu/) via the Research Data Curation program (https://library.duke.edu/data/data-management).

A selection from the Buffalo Bill papers, digitized as part of the Section A project.

New Digital Collections 2018

Additions to Existing Digital Collections

The men’s basketball team celebrates its 1991 championship win.

Collections Migrated into the Digital Repository

A collaborative approach to developing a new Duke Libraries catalog

Post contributed by: Emily Daly, Thomas Crichlow, and Cory Lown

If you’re a frequent or even casual user of the Duke Libraries catalog, you’ve probably noticed that it’s remained remarkably consistent over the last decade. Consistency can be a good thing, but there is certainly room for improvement in the Duke Libraries catalog, and staff from the libraries at Duke, UNC, and NCSU are excited to replace the current catalog’s aging infrastructure and outdated user interface with an entirely new collaboratively developed open-source discovery layer. While many things are changing, one key feature will remain the same: The catalog will continue to allow users to locate and access materials not only here at Duke but also across the other Triangle Research Libraries member libraries (NCSU, NCCU, UNC).

Users will be able to search for items in the Duke Libraries catalog and then expand to see books and items from NCSU, NCCU, and UNC if they wish.

Commitment to collaboration

In addition to an entirely new central index that supports institutional and consortial searching, the new catalog benefits from a shared, centrally developed codebase as well as locally hosted, customizable catalog interfaces. Perhaps most notably, the new catalog has been built with the needs of libraries and complex bibliographic data in mind. While the software used for the current library catalog has evolved and grown in complexity to support e-commerce and business needs (not higher ed or library needs), the library software development community has been hard at work building specialized discovery layers using the open-source Blacklight framework. Peer institutions including Stanford, Cornell, and Princeton are already using Blacklight for their library catalogs, and there is an active Blacklight development community that Duke is excited to be a part of. Being part of this community enables us to build on the good work already in place in other library catalogs, including more intuitive facets, adaptive linking for subjects and other fields, a more responsive user interface for access via tablets and phones, and the ability to preserve the order of MARC fields when it’s useful to researchers (MARC is an international standard for representing bibliographic and related data).

We’re upping our collaboration game locally, too: This project has given us the opportunity to develop a new model for collaborative software development. Rather than reinvent the wheel at each Triangle Research Library, we’re combining effort and expertise to develop a feature-rich yet highly customizable discovery layer that will serve the needs of researchers across the Triangle. To do this, we have adopted an agile project management process with talented developers and dedicated product owners from NCSU, UNC, and Duke. The agile approach has helped us be more productive and efficient during the development phase and increased collaboration across the four Triangle Research Libraries, positioning us well for maintaining and governing the catalog after we go live.

This image depicts the structure of the development team that was formed in May 2017 to collaboratively build the new library catalog.

What’s next?

The development team has already conducted multiple rounds of user testing and made changes to the user interface based on findings. We’re now ready to hear feedback from library staff. To facilitate this, we’ll be launching the Duke instance of the catalog to all library staff next Wednesday, August 1. We encourage staff to explore catalog features and records and then report feedback, providing screenshots, URLs, and other details as needed. We’ll continue user testing this fall and solicit extensive feedback from faculty, students, staff, and general researchers.

Our plan (fingers crossed!) is to officially launch the new Duke Libraries catalog to all users in early 2019, perhaps as soon as the start of the spring semester. A local implementation team is already at work to be sure we’re ready to replace Duke’s old catalog with the new and improved version early next year. Meanwhile, development and interface enhancement of the catalog will continue this fall. While we are pleased with what we’ve accomplished over the last 18 months, there is still significant work to be done before we’ll be ready to go live. Here are a few items on the lengthy TO DO list:

  • finish loading the 16 million records from all four Triangle Research Libraries
  • integrate Duke’s request workflows so users can request items they discover in the new catalog
  • develop a robust Advanced Search interface in response to user demand
  • tune relevance ranking
  • ensure that non-Roman scripts are searchable and display correctly
  • map non-MARC metadata so items such as digital collections records are discoverable
Effective search and display of non-Roman scripts is just one of the many items left on our list before we launch the library catalog to the public.

There is a lot of work ahead to be sure, but what we will launch to staff next week is a functional catalog with nearly 10 million records, and that number is increasing by the day. We invite you to take the new catalog for a spin and tell us what you think so we can make improvements and be ready for all researchers in just a few short months.

Charm City Sounds

Last week I had the opportunity to attend the 52nd Association for Recorded Sound Collections Annual Conference in Baltimore, MD.  From the ARSC website:

Founded in 1966, the Association for Recorded Sound Collections, Inc. is a nonprofit organization dedicated to the preservation and study of sound recordings—in all genres of music and speech, in all formats, and from all periods.

ARSC is unique in bringing together private individuals and institutional professionals. Archivists, librarians, and curators representing many of the world’s leading audiovisual repositories participate in ARSC alongside record collectors, record dealers, researchers, historians, discographers, musicians, engineers, producers, reviewers, and broadcasters.

ARSC’s vitality springs from more than 1000 knowledgeable, passionate, helpful members who really care about sound recordings.

ARSC Annual Conferences encourage open sharing of knowledge through informative presentations, workshops, and panel discussions. Tours, receptions, and special local events heighten the camaraderie that makes ARSC conferences lively and enjoyable.

This quote highlights several of the things that have made ARSC resources valuable and educational to me as the Audio Production Specialist at Duke Libraries:

  • The group’s membership includes both professionals and enthusiasts from a variety of backgrounds and types of institutions.
  • Members’ interests and specialties span a broad array of musical genres, media types, and time periods.
  • The organization serves as a repository of knowledge on obscure and obsolete sound recording media and technology.

This year’s conference offered a number of presentations that were directly relevant to our work here in Digital Collections and Curation Services, highlighting audio collections that have been digitized and the challenges encountered along the way.  Here’s a quick recap of some that stood out to me:

  • “Uncovering the Indian Neck Folk Festival Collection” by Maya Lerman (Folklife Center, Library of Congress).  This presentation showcased a collection of recordings and related documentation from a small invitation-only folk festival that ran from 1961 to 2014 and included early performances from Reverend Gary Davis, Dave Van Ronk, and Bob Dylan.  It touched on some of the difficulties in archiving optical and born-digital media (lack of metadata, deterioration of CD-Rs) as well as the benefits of educating prospective donors on best practices for media and documentation.
  • “A Garage in South Philly: The Vernacular Music Research Archive of Thornton Hagert” by David Sager and Anne Stanfield-Hagert.  This presentation paid tribute to the massive jazz archive of the late Mr. Hagert, comprising over 125,000 items of printed music, 75,000 recordings, 5,500 books, and 2,000 periodicals.  It spoke to the difficulties of selling or donating a private collection of this magnitude without splitting it up and undoing the careful but idiosyncratic organizational structure envisioned by the collector.
  • “Freedom is a Constant Struggle: The Golden State Mutual Sound Recordings” by Kelly Besser, Yasmin Dessem, and Shanni Miller (UCLA Library).  This presentation covered the audio material from the archive of an African American-owned insurance company founded in 1925 in Los Angeles.  While audio was only a small part of this larger collection, the speakers demonstrated how it added context and depth to photographs, video, and written documents.  They also showed how this kind of archival audio can be an important tool in telling the stories of previously suppressed or unheard voices.
  • “Sounds, Sights and Sites of Activism in ’68” by Guha Shankar (Library of Congress).  This presentation examined a collection of recordings from “Resurrection City” in Washington, DC, an encampment that was part of the Poor People’s Campaign, a demonstration for human rights organized by Martin Luther King, Jr. prior to his assassination in 1968.  The talk showed how these archival documents are being accessed and used to inform new forms of social and political activism and to reach wider audiences via podcasts, websites, public lectures, and exhibitions.

The ARSC Conference also touched on my personal interests in American traditional and vernacular music, especially folk and blues from the early 20th century.  Presentations on the bluegrass scene in Baltimore, blues guitarist Johnny Shines, education outreach by the creators of PBS’s “American Epic” documentaries, and Hickory, NC’s own Blue Sky Boys provided a welcome break from favorite archivist topics such as metadata, workflows, and quality control.  Other fun parts of the conference included an impromptu jam session, a silent auction of books & records, and posters documenting the musical history of Baltimore.  True to the city’s nickname, I was charmed by my time in Baltimore and inspired by the ARSC community’s amazingly diverse and dedicated work to collect and preserve our audio heritage.

Living Our Best DSpace Lives

Last week, an indefatigable team at Duke University Libraries released an upgraded version of the DukeSpace platform, completing  the first phase of the critical project that I wrote about in this space in January.  One member of the team remarked that we now surely have “one of the best DSpaces in the world,” and I dare anyone to prove otherwise.

DukeSpace serves as the Libraries’ open-access institutional repository, which makes it a key aspect of our mission to “partner in research,” as outlined in our strategic plan.  As I wrote in January, the version of the DSpace platform that underlies the service had been stuck at 1.7, which was released during 2010 – the year the iPad came out, and Lady Gaga wore a meat dress. We upgraded to version 6.2, though the differences between the two versions are so great that it would be more accurate to call the project a migration.

That migration turned out to be one of the more complex technology projects we’ve undertaken over the years. The main complicating factor was the integration with Symplectic Elements, the Research Information Management System (RIMS) that powers the Scholars at Duke site. As far as we know, we are the first institution to integrate Elements with DSpace 6.2. It was a beast to do, and we are happy to share our knowledge gained if it will help any of our peers out there trying to do the same thing.

Meanwhile, feel free to click on over to DukeSpace and enjoy one of the best DSpaces in the world. And congratulations to one of the mightiest teams assembled since Spain won the World Cup!

Interactive Transcripts have Arrived!

This week Duke Digital Collections added our first set of interactive transcripts to one of our newest digital collections: the Silent Vigil (1968) and Allen Building Takeover (1969) collection of audio recordings.   This marks an exciting milestone in the accessibility efforts Duke University Libraries has been engaged in for the past 2.5 years. Last October, my colleague Sean wrote about our new accessibility features and the technology powering them, and today I’m going to tell you a little more about why we started these efforts as well as share some examples.

Interactive Transcript in the Silent Vigil (1968) and Allen Building Takeover (1969) Audio Recordings

Providing access to captions and transcripts is not new for digital collections.  We have been able to provide access to pdf transcripts and captions both in digital collections and finding aids for years. See items from the Behind the Veil and Memory Project digital collections for examples.

In recent years, however, we stepped up our efforts in creating captions and transcripts. Our work began in response to a 2015 lawsuit brought against Harvard and MIT by the National Association of the Deaf. The lawsuit triggered many discussions in the library, and the Advisory Council for Digital Collections eventually decided that we would proactively create captions or transcripts for all new A/V digital collections when it is feasible and reasonable to do so.  The feasible and reasonable part of our policy is key.  The Radio Haiti collection, for example, is composed of thousands of recordings primarily in Haitian Creole and French; the cost of transcribing that volume of material in non-English languages makes doing so neither reasonable nor feasible. In addition to our work in the library, Duke has established campus-wide web accessibility guidelines that include captioning and transcription, so our work in digital collections is only one aspect of campus-wide accessibility efforts.

To create transcripts and captions, we have partnered with several vendors since 2015, and we have seen the costs for these services drop dramatically.  Our primary vendor right now is Rev, who also works with Duke’s Academic Media Services department.  Rev guarantees 99% accurate captions or transcripts for $1/minute.

Early on, Duke Digital Collections decided to center our captioning efforts around the WebVTT format, a time-coded, text-based format and a W3C standard.  We use it for both audio and video captions when possible, but we can also accommodate legacy transcript formats like pdfs.  Transcripts and captions can easily be replaced with new versions if and when edits need to be made.

Examples from the Silent Vigil (1968) and Allen Building Takeover (1969) Audio Recordings

When WebVTT captions are present, they load in the interface as an interactive transcript.  The transcript can be used for navigation: click the text, and playback jumps to that portion of the recording.

Click the image above to see the full item and transcript.

In addition to providing access to transcripts on the screen, we offer downloadable versions of the WebVTT transcript as a text file, a pdf, or the original WebVTT format.

An advantage of the WebVTT format is that it includes “v” (voice) tags, which can be used to note changes in speakers; one can even add speaker names to the transcript.  This can require additional manual work if the names of the speakers are not obvious to the vendor, but we are excited to have this capability.
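
To make the format concrete, here is a minimal sketch of a WebVTT file with voice tags, generated with a few lines of Python; the timestamps and speakers are invented for illustration and are not drawn from the actual collection.

    # Minimal sketch: write a small WebVTT file with <v> voice tags.
    # The cue content below is invented for illustration.
    from pathlib import Path

    cues = [
        ("00:00:01.000", "00:00:07.500", "Moderator", "Welcome, and thank you all for being here."),
        ("00:00:07.500", "00:00:15.000", "Panelist", "I'd like to start with how the vigil began."),
    ]

    lines = ["WEBVTT", ""]
    for start, end, speaker, text in cues:
        lines.append(f"{start} --> {end}")
        lines.append(f"<v {speaker}>{text}")
        lines.append("")

    Path("example.vtt").write_text("\n".join(lines), encoding="utf-8")

Each cue pairs a start and end time with a line of text, and the <v Speaker> tag is what lets a player label and style the change in voice.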

As Sean described in his blog post, we can also provide access to legacy pdf documents.  They cannot be rendered into an interactive version, but they are still accessible for download.

On a related note, we also have a new feature that links time codes listed in the description metadata field of an item to the corresponding portion of the audio or video file.  This enables librarians to describe specific segments of audio and video items.  The Radio Haiti digital collection is the first to use this feature, but it will also be a huge benefit to the H. Lee Waters and Chapel Recordings digital collections, as well as many others.
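
As a rough illustration of how a feature like this can work (a hypothetical sketch, not the repository’s actual implementation), a description can be scanned for timestamps and each one rewritten as a link carrying the number of seconds a player should seek to:

    import re

    # Hypothetical sketch: turn "HH:MM:SS" timestamps in a description into links
    # that carry a seek offset in seconds; the markup is invented for illustration.
    TIMECODE = re.compile(r"\b(\d{1,2}):([0-5]\d):([0-5]\d)\b")

    def link_timecodes(description: str) -> str:
        def to_link(match: re.Match) -> str:
            hours, minutes, seconds = (int(part) for part in match.groups())
            total = hours * 3600 + minutes * 60 + seconds
            return f'<a href="#" data-seek-seconds="{total}">{match.group(0)}</a>'
        return TIMECODE.sub(to_link, description)

    print(link_timecodes("Interview begins at 00:03:15; discussion of the takeover at 00:41:02."))

A small bit of player-side script can then read the data-seek-seconds value from a clicked link and set the audio or video element’s current playback time.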

Click the image above to interact with linked time codes.

As mentioned at the top of this post, the Duke Vigil and Allen Building Takeover collection includes our first batch of interactive transcripts.  We plan to launch more this spring, so stay tuned!!

Moving the mountain (of data)

It’s a new year! And a new year means new priorities. One of the many projects DUL staff have on deck for the Duke Digital Repository in the coming calendar year is an upgrade to DSpace, the software application we use to manage and maintain our collections of scholarly publications and electronic theses and dissertations. As part of that upgrade, the existing DSpace content will need to be migrated to the new software. Until very recently, that existing content included a few research datasets deposited by Duke community members. But with the advent of our new research data curation program, research datasets have been published in the Fedora 3 part of the repository. Naturally, we wanted all of our research data content to be found in one place, so that meant migrating the few existing outliers. And given the ongoing upgrade project, we wanted to be sure to have it done and out of the way before the rest of the DSpace content needed to be moved.

The Integrated Precipitation and Hydrology Experiment

Most of the datasets that required moving were relatively small: a handful of files each, all of manageable size (under a gigabyte), that could be exported using DSpace’s web interface. However, one set of data, associated with a project called The Integrated Precipitation and Hydrology Experiment (IPHEx), was a notable exception. There’s a lot of data associated with the IPHEx project: it was recorded daily for 7 years and iterated over 3 different areas of coverage, and with some supplementary data files added in, the total footprint came to just under a terabyte spread over more than 7,000 files. So this project needed some advance planning.

First, the size of the project meant that the data were too large to export through the DSpace web client, so we needed the developers to wrangle a behind-the-scenes dump of what was in DSpace to a local file system. Once we had everything we needed to work with (which included some previously unpublished updates to the data that we received last year from the researchers), we had to make some decisions about how to model it. The data model used in DSpace was a bit limiting, which resulted in the data being made available as a long list of files for each part of the project. In moving the data to our Fedora repository, we gained a little more flexibility in how we could arrange the files. We determined that we wanted to deviate slightly from the arrangement in DSpace, grouping the files by month and year.

This meant we would have to group all the files into subdirectories containing the data for each month. For over 7,000 files, that would have been extremely tedious to do by hand, so we wrote a script to do the sorting for us (a simplified sketch follows below). With that completed, we were able to carry out the ingest process as normal. The final wrinkle associated with the IPHEx project was making sure that the persistent identifiers each part of the project data had been assigned in DSpace still resolved to the correct content. One of our developers was able to set up a server redirect to ensure that each URL would still take a user to the right place. As of the new year, the IPHEx project data (along with our other migrated DSpace datasets) are available in their new home!
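
That sketch, simplified, might look like this; it assumes each filename embeds a YYYYMMDD date, which may not match the real IPHEx naming convention:

    import re
    import shutil
    from pathlib import Path

    # Simplified sketch: sort data files into year-month subdirectories based on a
    # YYYYMMDD date assumed to appear somewhere in each filename.
    SOURCE = Path("iphex_export")  # hypothetical directory holding the exported files
    DATE = re.compile(r"(\d{4})(\d{2})(\d{2})")

    for path in sorted(SOURCE.iterdir()):
        if not path.is_file():
            continue
        match = DATE.search(path.name)
        if match is None:
            print(f"Skipping {path.name}: no date found in filename")
            continue
        year, month, _day = match.groups()
        target_dir = SOURCE / f"{year}-{month}"
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))

The result is one subdirectory per year-month combination, which matches the arrangement described above.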

At least (of course) until the next migration.

A Year in the Life of Digital Collections

2017 has been an action-packed year for Digital Collections, full of exciting projects, interface developments, and new processes and procedures. This blog post is an attempt to summarize just a few of our favorite accomplishments from the last year. Digital collections work is truly a cross-departmental collaboration here at Duke, and we couldn’t complete any of the work listed below without all our colleagues across the library – thanks to all!

New Digital Collections Portal

Regular visitors to Duke Digital Collections may have noticed that our old portal (library.duke.edu/digitalcollections/) now redirects to our new homepage on the Duke Digital Repository (DDR) public interface. We are thrilled to make this change! But never fear, your favorite collections that have not been migrated to DDR are still accessible either on our Tripod2 interface or by visiting new placeholder landing pages in the Digital Repository.

Audiovisual Features

Supporting A/V materials in the Digital Repository has been a major software development priority throughout 2017. As a result, our A/V items are becoming more accessible and easier to share. Thanks to a year of hard work, we can now do and support the following (we shared examples of these in a previous post).

  • Model, store and stream A/V derivatives
  • Share A/V easily through our embed feature (even on Duke WordPress sites- a long standing bug)
  • Finding aids can now display inline A/V for DAOs (digital archival objects) from the DDR
  • Clickable timecode links in item descriptions (example)
  • Display captions and interactive transcripts
  • Download and export captions and transcripts (as .pdf, .txt, or .vtt)
  • Display video thumbnails & poster frames

Rights Statements and Metadata

Bitstreams recently featured a review of all things metadata from 2017, many of which impact the digital collections program. We are especially pleased with our rights management work from the last year and our rights statements implementation (http://rightsstatements.org/en/). We are still in the process of retrospectively applying the statements, but we are making good progress. The end result will give our patrons a clearer indication of the copyright status of our digital objects and how they can be used. Read more about our rights management work in past Bitstreams posts.

Also this year in metadata, we have been developing integrations between ArchivesSpace (the tool Rubenstein Library uses for finding aids) and the repository, a project that has been in the works since 2015. With these new features, Rubenstein’s archivist for metadata and encoding is in the process of reconciling metadata between ArchivesSpace and the Digital Repository for approximately 50 collections to enable bi-directional links between the two systems. Bi-directional linking helps our patrons move easily from a digital object in the repository to its finding aid or catalog record and vice versa. You can read about the start of this work in a blog post from 2016.

MSI

At the end of 2016, Duke Libraries purchased Multispectral Imaging (MSI) equipment, and members of Digital Collections, Data and Visualization Studies, Conservation Services, the Duke Collaboratory for Classics Computing, and the Rubenstein Library joined forces to explore how to best use the technology to serve the Duke community. The past year has been a time of research, development, and exploration around MSI and you can read about our efforts on Bitstreams. Our plan is to launch an MSI service in 2018. Stay tuned!

Ingest into the Duke Digital Repository (DDR)

With the addition of new colleagues focused on research data management, there have been more demands on, and enhancements to, our DDR ingest tools. Digital Collections has benefited from more robust batch ingest features as well as the ability to upload more types of files (captions, transcripts, derivatives, thumbnails) through the user interface. We can also now ingest nested folders of collections. On the other end of the spectrum, we now have the ability to batch export sets of files or even whole collections.

Project Management

The Digital Collections Advisory Committee and Implementation Team are always looking for more efficient ways to manage our sprawling portfolio of projects and services. We started 2017 with a new call for proposals around the themes of diversity and inclusion, which resulted in 7 successful proposals that are now in the process of implementation.

In addition to the thematic call for proposals, we later rolled out a new process for our colleagues to propose smaller projects in response to faculty requests, events, or other needs. In other words, these are projects of a certain size and scope that are not required to respond to a thematic call for proposals. The idea is that such projects can be implemented easily and therefore do not require extensive project management to complete. Our first completed “easy” project is the Carlo Naya photograph albums of Venice.

In 2016 (perhaps even back in 2015), the digital collections team started working with colleagues in Rubenstein to digitize the set of collections known as “Section A”. The history of this moniker is a little uncertain, so let me just say that Section A is a set of 3000+ small manuscript collections (1-2 folders each) boxed together; each Section A box holds up to 30 collections. Section A collections are highly used and are often the subject of reproduction requests, which makes them perfect candidates for digitization. Our goal has been to set up a mass-digitization pipeline for these collections that involves vetting rights, updating description, evaluating their condition, digitizing them, ingesting them into DDR, crosswalking metadata, and finally making them publicly accessible in the repository and through their finding aids. In 2017 we evaluated 37 boxes for rights restrictions, updated descriptions for 24 boxes, assessed the condition of 31 boxes, digitized 19 boxes, ingested 4 boxes, crosswalked metadata for 2 boxes, and box 1 is now online! Read more about the project in a May Bitstreams post. Although progress has felt slow given all the other projects we manage simultaneously, we really feel like our foot is on the gas now!

You can see the fruits of our digital collection labors in the list of new and migrated collections from the past year. We are excited to see what 2018 will bring!!

New Collections Launched in 2017

Migrated Collections

Change is afoot in Software Development and Integration Services

We’re experimenting with changing our approach to projects in Software Development and Integration Services (SDIS). There’s been much talk of Agile (see the Agile Manifesto) over the past few years within our department, but we’ve faced challenges implementing this as an approach to our work given our broad portfolio, relatively small team, and large number of internal stakeholders.

After some productive conversations among staff and managers in SDIS, where we reflected on our work over the past few years, we decided to commit to applying the Scrum framework to one or more projects.

Scrum Framework
Source: https://commons.wikimedia.org/wiki/File:Scrum_Framework.png

There are many resources available for learning about Agile and Scrum. The resources I’ve found most useful so far in learning about the framework include:

Scrum seems best suited to developing new products or software and defines the roles, workflow, and artifacts that help a team make the most of its capacity to build the highest value features first and deliver usable software on a regular and frequent schedule.

To start, we’ll be applying this process to a new project to build a prototype of a research data repository based on Hyrax. We’ve formed a small team, including a product owner, scrum master, and development team, to build the repository. So far, we’ve developed an initial backlog of requirements in the form of user stories in Jira, the software we use to manage projects. We’ve done some backlog refinement to prioritize the most important and highest-value features, and defined acceptance criteria for the ones that we’ll consider first. The development team has estimated story points (a relative measure of effort and complexity) for some of the user stories to help us with sprint planning and release projection. Our first two-week sprint will begin the week after Thanksgiving. By the end of January we expect to have completed four two-week sprints and have a pilot ready, with a basic set of features implemented, for evaluation by internal stakeholders.
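
For readers unfamiliar with the format, a user story with acceptance criteria and an estimate might look like the following (an invented example for illustration, not an item from our actual backlog):

    As a researcher, I want to deposit a dataset together with a README file,
    so that others can understand and reuse my data without contacting me.

    Acceptance criteria:
    - The deposit form accepts multiple files in a single submission.
    - The README is displayed alongside the dataset's descriptive metadata.
    - The submission is rejected with a clear message if no README is attached.

    Estimate: 3 story points

Estimates like this are relative rather than absolute; their main use is projecting how much work can fit into a two-week sprint.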

One of the important aspects of Scrum is that group reflection on the process itself is built into the workflow through retrospective meetings after each sprint. Done right, routine retrospectives serve to reinforce what is working well and allow for adjustments to address things that aren’t. In the future we hope to adapt what we learn from applying the Scrum framework to the research data repository pilot to improve our approach to other aspects of our work in SDIS.

Voices from the Movement

This past year the SNCC Digital Gateway has brought a number of activists to Duke’s campus to discuss lesser-known aspects of the Student Nonviolent Coordinating Committee (SNCC)’s history and how their approach to organizing shifted over time. These sessions ranged from the development of the Black Panther symbol for the Lowndes County Freedom Party, to the strength of local people in the Movement in Southwest Georgia, to the global network supporting SNCC’s fight for Black empowerment in the U.S. and across the African Diaspora. Next month, there will be a session focused on music in the Movement, with a public panel on the evening of September 19th.

Screenshot of “Born into the Movement,” from the “Our Voices” section on the SNCC Digital Gateway.

These visiting activist sessions, often spanning the course of a few days, produce hours of audio and video material, as SNCC veterans reengage with the history through conversation with their comrades. And this material is rich, as memories are dusted off and those involved explore how and why they did what they did. However, considering the structure of the SNCC Digital Gateway and wanting to make these 10-hour collections of A/V material digestible and accessible, we’ve had to develop a means of breaking them down.

Step One: Transcription

As is true for many projects, you begin by putting pen to paper (or by typing furiously). With the amount of transcribing that we do for this project, we’re certainly interested in making the process as seamless as possible. We depend on ExpressScribe, which allows you to set hot keys to start, stop, rewind, and fast forward audio material. Another feature is that you can easily adjust the speed at which the recording is being played, which is helpful for keeping your typing flow steady and uninterrupted. For those who really want to dive in, there is a foot pedal extension (yes, one did temporarily live in our project room) that allows you to control the recording with your feet – keeping your fingers even more free to type at lightning speed. After transcribing, it is always good practice to review the transcription, which you can do efficiently while listening to a high speed playback.

Step Two: Selecting Clips

Once these have been transcribed (each session results in a transcript of approximately 130 single-spaced pages), it is time to select clips. For the parameters of this project, we keep the clips roughly between 30 seconds and 8 minutes and intentionally try to pull out the most prominent themes from the conversation. We then try to fit our selections into a larger narrative that tells a story. This process takes multiple reviews of the material and a significant amount of back and forth to ensure that the narrative stays true to the sentiments of the entire conversation.

The back-end of one of our pages.

Step Three: Writing the Narrative

We want users to listen to all of the A/V material, but sometimes details need to be laid out so that the clips themselves make sense. This is where the written narrative comes in. Without detracting from the wealth of newly-created audio and video material, we try to fill in some of the gaps and contextualize the clips for those who might be less familiar with the history. In addition to the written narrative, we embed relevant documents and photographs that complement the A/V material and give greater depth to the user’s experience.

Step Four: Creating the Audio Files

With all of the chosen clips pulled from the transcript, it’s time to actually make the audio files. For each of these sessions, we have multiple recorders in the room, in order to ensure everyone can be heard on the tape and that none of the conversation is lost due to recorder malfunction. These recorders are set to record in .WAV files, an uncompressed audio format for maximum audio quality.

One complication with having multiple mics in the room, however, is that the timestamps on the files are not always one-to-one. In order to easily pull the clips from the best recording we have, we have to sync the files. Our process involves first creating a folder system on an external hard drive. We then create a project in Adobe Premiere and import the files. It’s important that these files be on the same hard drive as the project file so that Premiere can easily find them. Then, we make sequences of the recordings and match the waveform from each of the mics. With a combination of using the timestamps on the transcriptions and scrubbing through the material, it’s easy to find the clips we need. From there, we can make any post-production edits that are necessary in Adobe Audition and export them as .mp3 files with Adobe Media Encoder.
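
For anyone curious how those offsets could be estimated programmatically rather than by matching waveforms by eye, here is a rough sketch (not part of our actual Premiere workflow) that uses cross-correlation to measure how far apart two recordings of the same room begin; it assumes both were captured at the same sample rate, and the filenames are hypothetical.

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import correlate

    # Rough sketch: estimate the offset between two recordings of the same event
    # by cross-correlating their waveforms.
    rate_a, a = wavfile.read("recorder_a.wav")
    rate_b, b = wavfile.read("recorder_b.wav")
    assert rate_a == rate_b, "both recorders must use the same sample rate"

    # Use a single channel, as floats, to keep the correlation simple.
    a = (a[:, 0] if a.ndim > 1 else a).astype(np.float64)
    b = (b[:, 0] if b.ndim > 1 else b).astype(np.float64)

    # lag > 0 means recorder B started that many samples after recorder A.
    lag = int(np.argmax(correlate(a, b, mode="full"))) - (len(b) - 1)
    print(f"Offset of recorder B relative to recorder A: {lag / rate_a:+.2f} seconds")

In practice you would correlate a short, loud excerpt (a clap or the first few seconds of speech) rather than entire multi-hour files, but the idea is the same: the lag with the strongest correlation tells you how much silence to add to the later-starting recording so the two line up.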

Step Five: Uploading & Populating

Due to the SNCC Digital Gateway’s sustainability requirements, we host the files in a Duke Digital Collections folder and then embed them in the website, which is built on a WordPress platform. These clips are then woven together with text, documents, and images to tell a story.