All posts by Molly Bragg

We are Hiring: Digital Repository Content Analyst

Duke University Libraries (DUL) is recruiting a Digital Repository Content Analyst to help us ingest and manage content in our digital preservation systems and platforms.  This position will partner with the Research Data Curation Program, Digital Collections Program, and various other departments around the Library and on campus to provide curation and preservation services.  This is an excellent entry level opportunity for anyone who enjoys managing large sets of data and/or files, working with colleagues across an organization, preserving essential data and library collections, and learning new technical skills.

Ideal applicants have been exposed to technical systems and file management techniques such as command line scripting, can communicate functional system requirements between groups with varying types of expertise, enjoy working with different types of data/collections, and love solving problems.  The successful candidate will join the highly collaborative Digital Collections and Curation Services department (within the Digital Strategies and Technology Division) at DUL.

For a full job description please see https://library.duke.edu/about/jobs/digitalrepositorycontentanalyst. To apply, submit an electronic resume, cover letter, and list of 3 references: https://hr.duke.edu/careers/apply – refer to requisition #401537489. Review of applications will begin immediately and will continue until the position is filled.

Digital Collections Round Up 2018

It’s that time of year where we like to reflect on all we have done in 2018, and your favorite digital collections and curation services team is no exception.  This year, Digital Collections and Curation Services have been really focusing on getting collections and data into the Digital Repository and making them accessible to the world!

As you will see from the list below we launched 320 new digital collections, managed substantial additions to 2, and migrated 8. However, these publicly accessible digital collections are just the tip of the iceberg in terms of our work in Digital Collections and Curation Services.

A cover from the Ladyslipper, Inc Retail Catalogs digital collection.

So much more digitization happens behind the scenes than is reflected in the list of new digital collections.  Many of our larger projects are years in the making. For example, we continue to digitize Gedney photographs and we hope to make them publicly accessible next year.  There is also preservation digitization happening that we cannot make publicly accessible online.  This work is essential to our preservation mission, though we cannot share the collections widely in the short term.

We strongly believe in keeping our metadata healthy, so in addition to managing new metadata, we often revisit existing metadata across our repositories in order to ensure its overall quality and functionality.

Our team is also responsible for ingesting not just digital collections, but research data and library collections as well.  We preserved 20 datasets produced by the Duke Scholarly Community in the Research Data Repository (https://research.repository.duke.edu/)  via the Research Data Curation program https://library.duke.edu/data/data-management.

A selection from the Buffalo Bill papers, digitized as part of the Section A project.

New Digital Collections 2018

Additions to Existing Digital Collections

The men’s basketball team celebrates its 1991 championship win.

Collections Migrated into the Digital Repository

We are Hiring!

Duke University Libraries is recruiting a Digital Production Services Manager to direct the operations of our Digital Production Center, its staff (3 FTE plus student assistants), and associated digitization services. We are seeking someone experienced in leading digitization projects who is excited to partner with colleagues around the library to reformat and preserve unique library collections and provide access to them online. This is an excellent opportunity for someone who likes working with people, projects, and primary sources!

This newly created position combines people and project management responsibilities with hands-on digitization duties. Previous supervisory experience is not required; however, the ability to direct the work of others is essential to this position, as is a service oriented attitude. Strong organizational and project management skills are also a must. Some form of digitization experience in a library or other cultural heritage setting is required for this role as well. The successful candidate will join the highly collaborative Digital Collections and Curation Services department and work under the direct supervision of the department head.

The Digital Production Center (DPC) is a specialized unit that creates digital surrogates of primary resources from Duke University Libraries collections for the purposes of preservation and access. Learn more about the DPC on our web page, or through the Digital Strategies and Technology division’s blog, Bitstreams. To see some of the materials we have digitized, check out Duke Digital Collections online.

Duke is a diverse community committed to the principles of excellence, fairness, and respect for all people. As part of this commitment, we actively value diversity in our workplace and learning environments as we seek to take advantage of the rich backgrounds and abilities of everyone. We believe that when we understand, celebrate, and tap into our uniqueness to creatively solve problems and address shared goals, our possibilities are limitless. Duke University Libraries value diversity of thought, perspective, experience, and background and are actively committed to a culture of inclusion and respect.

Duke offers a comprehensive benefit package, which includes both traditional benefits such as health insurance, leave time and retirement, as well as wide ranging work/life and cultural benefits. Details can be found at: http://www.hr.duke.edu/benefits/index.php.

For a full job description please see https://library.duke.edu/about/jobs/dpsmanager. To apply, submit an electronic resume, cover letter, and list of 3 references: https://hr.duke.edu/careers/apply – refer to requisition #401463554. Review of applications will begin immediately and will continue until the position is filled.

Multispectral Imaging Summer Snapshots

If you are a regular Bitstreams reader, you know we just love talking about Multispectral Imaging.  Seriously, we can go on and on about it, and we are not the only ones.   This week however we are keeping it short and sweet and sharing a couple before and after images from one of our most recent imaging sessions.

Below are two stacked images of Ashkar MS 16 (from the Rubenstein Library).  The top half of each image is the manuscript under natural light, and the bottom half shows the results of Multispectral imaging and processing.  We tend to post black and white MSI images most often as they are generally the most legible; however, our MSI software can produce a lot of wild color variations!  The orange one below seemed the most appropriate for a hot NC July afternoon like today.  More processing details are included in the image captions below – enjoy!

The text of this manuscript above was revealed primarily with the IR narrowband light at 780 nm.
This image was created using Teem, a tool used to process and visualize scientific raster data. This specific image is the result of flatfielding each wavelength image and arranging them in wavelength order to produce a vector for each pixel. The infinity norm is computed for each vector to produce a scalar value for each pixel which is then histogram-equalized and assigned a color by a color-mapping function.
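
The processing chain described in this caption can be sketched in a few lines of NumPy. This is an illustration only, not Teem and not our actual pipeline: the function names, the synthetic inputs, and the black-to-orange-to-white color map are all assumptions made for the example, and the flat-field references are assumed to contain no zeros.

```python
import numpy as np

ORANGE = np.array([1.0, 0.5, 0.0])
WHITE = np.array([1.0, 1.0, 1.0])

def flatfield(raw, flat):
    """Divide a band image by its flat-field reference (scaled to mean 1)."""
    flat = flat.astype(float)
    return raw.astype(float) / (flat / flat.mean())

def infinity_norm_image(bands):
    """Stack bands (already in wavelength order) into one vector per pixel,
    then take the infinity norm: the largest absolute value in each vector."""
    stack = np.stack(bands, axis=-1)           # shape: (rows, cols, n_bands)
    return np.abs(stack).max(axis=-1)          # shape: (rows, cols)

def histogram_equalize(scalar, bins=256):
    """Map each scalar value to [0, 1] through the empirical CDF."""
    hist, edges = np.histogram(scalar, bins=bins)
    cdf = hist.cumsum() / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    return np.interp(scalar.ravel(), centers, cdf).reshape(scalar.shape)

def orange_colormap(equalized):
    """Hypothetical color map: black -> orange -> white."""
    t = equalized[..., np.newaxis]
    lower = 2.0 * t * ORANGE
    upper = ORANGE + (2.0 * t - 1.0) * (WHITE - ORANGE)
    return np.clip(np.where(t < 0.5, lower, upper), 0.0, 1.0)

def process(raw_bands, flat_bands):
    """Run the whole chain and return an RGB image with values in [0, 1]."""
    corrected = [flatfield(r, f) for r, f in zip(raw_bands, flat_bands)]
    scalar = infinity_norm_image(corrected)
    return orange_colormap(histogram_equalize(scalar))
```

Calling `process(raw_bands, flat_bands)` with two equal-length lists of 2-D arrays, one per wavelength, returns an array of shape `(rows, cols, 3)`.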

New and Migrated Digital Collections Round up

We are halfway through 2018, and so it seems like a fitting time to share new and newly migrated digital collections.  

Digital Collections Launched or Migrated since January 1 2018:
These collections should be publicly accessible in late June or early July:

Looking ahead to the rest of the year, we will have more Radio Haiti recordings, 1990s issues of the Duke Chronicle, the Josephine Leary papers, more of your favorite legacy digital collections moving over to the digital repository and so much more! Stay tuned!

Announcing Multispectral Imaging Service, version 1

As regular Bitstreams readers know, a cross departmental team within Duke University Libraries has been exploring Multispectral Imaging and its potential to make Duke collections more accessible to researchers in the Duke scholarly community and beyond since 2015. After spending 2017 developing MSI workflows, building expertise, writing documentation, and responding to experimental imaging requests, we are now ready to unveil the first version of Duke University Libraries MSI service for researchers!

Our first service model version accommodates small requests that are not urgent. The MSI team wants to partner with researchers to facilitate their requests as well as hear feedback about our current service and any other needs for MSI. We are offering MSI services for free for the next few months, but will institute a fee structure this Summer.

The service breaks down into 4 general steps:

  • First, researchers submit a request for MSI services using a webform. The form prompts requesters to share their research question and details about what they want imaged. We also want to know where researchers are from, as we are expecting both Duke and non-Duke affiliated patrons.
  • Second, the MSI team will review all requests, as MSI is not the ideal imaging solution for all materials and research questions. Requests that will not benefit from MSI will not be approved.
  • Third, we schedule approved requests for imaging and processing. We plan to conduct 1 imaging and processing day per month, so it may take several weeks to a month for approved requests to make it through our full process.
  • Fourth, we deliver the processed files to our patrons along with a report that details the imaging and processing procedures and outcomes.

Please note the following:
We are currently only imaging Duke University Library holdings.
We are limiting requests to 1-3 individual items or 1-3 pages within a bound item (which is the number of items we can generally image and process in 1 day).
Allow 2-4 weeks for vetting and up to a month for imaging.

If you are interested in requesting MSI services, but your needs do not fit the service described here, we still want to hear from you! Please do not hesitate to fill out our researcher request form to get the process started, or contact Susan Ivey directly.

If you want to learn more about MSI, check out the recent talk we gave at the Friday Visualization Forum on February 23.

Interactive Transcripts have Arrived!

This week Duke Digital Collections added our first set of interactive transcripts to one of our newest digital collections: the Silent Vigil (1968) and Allen Building Takeover (1969) collection of audio recordings.   This marks an exciting milestone in the accessibility efforts Duke University Libraries has been engaged in for the past 2.5 years. Last October, my colleague Sean wrote about our new accessibility features and the technology powering them, and today I’m going to tell you a little more about why we started these efforts as well as share some examples.

Interactive Transcript in the Silent Vigil (1968) and Allen Building Takeover (1969) Audio Recordings

Providing access to captions and transcripts is not new for digital collections.  We have been able to provide access to pdf transcripts and captions both in digital collections and finding aids for years. See items from the Behind the Veil and Memory Project digital collections for examples.

In recent years, however, we stepped up our efforts in creating captions and transcripts. Our work began in response to a 2015 lawsuit brought against Harvard and MIT by the National Association of the Deaf. The lawsuit triggered many discussions in the library, and the Advisory Council for Digital Collections eventually decided that we would proactively create captions or transcripts for all new A/V digital collections assuming it is feasible and reasonable to do so.  The feasible and reasonable part of our policy is key.  The Radio Haiti collection for example is composed of thousands of recordings primarily in Haitian Creole and French.  The costs to transcribe that volume of material in non-English languages make it unreasonable (and not feasible) to transcribe. In addition to our work in the library, Duke has established campus wide web accessibility guidelines that include captioning and transcription.  Therefore our work in digital collections is only one aspect of campus wide accessibility efforts.

To create transcripts and captions, we have partnered with several vendors since 2015, and we have seen the costs for these services drop dramatically.  Our primary vendor right now is Rev, who also works with Duke’s Academic Media Services department.  Rev guarantees 99% accurate captions or transcripts for $1/minute.

Early on, Duke Digital Collections decided to center our captioning efforts around the WebVTT format, which is a time-coded text based file and a W3C standard.  We use it for both audio and video captions when possible, but we can also accommodate legacy transcript formats like pdfs.  Transcripts and captions can be easily replaced with new versions if and when edits need to be made.

Examples from the Silent Vigil (1968) and Allen Building Takeover (1969) Audio Recordings

When WebVTT captions are present, they load in the interface as an interactive transcript.  This transcript can be used for navigation purposes; click the text and the file moves to that portion of the recording.

Click the image above to see the full item and transcript.

In addition to providing access to transcripts on the screen, we offer downloadable versions of the WebVTT transcript as a text file, a pdf or in the original webVTT format.
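
To make the idea concrete, here is a minimal sketch of how a WebVTT file can drive both navigation and a plain-text download. It is not the code behind our interface: the sample cues are invented, the function names are hypothetical, and the parser handles only simple cues (no styling blocks, notes, or cue settings).

```python
import re
from bisect import bisect_right

# Invented sample; real files come from the captioning vendor.
SAMPLE_VTT = """WEBVTT

00:00:00.000 --> 00:00:04.500
Good evening, and thank you for coming.

00:00:04.500 --> 00:00:09.000
We are gathered on the quad tonight.
"""

STAMP = r"(?:\d+:)?\d{2}:\d{2}\.\d{3}"
TIMING = re.compile(rf"({STAMP})\s+-->\s+({STAMP})")

def to_seconds(stamp):
    """Convert 'hh:mm:ss.ttt' or 'mm:ss.ttt' to seconds."""
    parts = stamp.split(":")
    hours = int(parts[0]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])

def parse_cues(vtt_text):
    """Return a list of (start, end, text) tuples, one per cue."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                text = " ".join(lines[i + 1:]).strip()
                cues.append((to_seconds(match.group(1)),
                             to_seconds(match.group(2)), text))
                break
    return cues

def cue_at(cues, seconds):
    """Find the cue playing at a given time, or None (cues sorted by start)."""
    starts = [start for start, _, _ in cues]
    index = bisect_right(starts, seconds) - 1
    if index >= 0 and seconds < cues[index][1]:
        return cues[index]
    return None

def plain_text(cues):
    """Drop the timings to produce a simple text transcript."""
    return "\n".join(text for _, _, text in cues)
```

Clicking a line of the transcript amounts to seeking the player to that cue's start time, and `cue_at` is the reverse lookup a player would need to highlight the current line during playback.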

An advantage of the WebVTT format is that it includes “v” tags, which can be used to note changes in speakers, and one can even add names to the transcript.  This can require additional manual work if the names of the speakers are not obvious to the vendor, but we are excited to have this opportunity.
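
As a rough illustration of how a "v" tag can be read out of a cue, the sketch below pulls the speaker name and the spoken text apart. The cue strings are made up and the helper names are hypothetical; it assumes one voice tag at the start of each cue, which is the simplest case the format allows.

```python
import re

# <v Name>text</v>; the closing tag may be omitted when the voice
# covers the whole cue, and the tag may carry classes such as <v.loud Name>.
VOICE_TAG = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>(.*?)(?:</v>|$)", re.DOTALL)

def speaker_and_text(cue_text):
    """Return (speaker, text); speaker is None when the cue has no voice tag."""
    cue_text = cue_text.strip()
    match = VOICE_TAG.match(cue_text)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return None, cue_text

def labeled_line(cue_text):
    """Render a cue as 'Speaker: text' for a readable transcript."""
    speaker, text = speaker_and_text(cue_text)
    return f"{speaker}: {text}" if speaker else text
```

For example, `labeled_line("<v Speaker 1>We shall not be moved.</v>")` gives `Speaker 1: We shall not be moved.`, and replacing the generic label with a real name is exactly the manual step described above.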

As Sean described in his blog post, we can also provide access to legacy pdf documents.  They cannot be rendered into an interactive version, but they are still accessible for download.

On a related note, we also have a new feature that links time codes listed in the description metadata field of an item to the corresponding portion of the audio or video file.  This enables librarians to describe specific segments of audio and/or video items.  The Radio Haiti digital collection is the first to utilize this feature, but the feature will be a huge benefit to the H. Lee Waters and Chapel Recordings digital collections as well as many others.

Click the image above to interact with linked time codes.
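
One way to picture the linked time codes: scan the description for anything shaped like a time code and wrap it in a link that jumps to that offset. The sketch below is an assumption-laden illustration, not our implementation. It assumes time codes are written as `mm:ss` or `hh:mm:ss`, uses a hypothetical media URL, and borrows the `#t=` fragment from the W3C Media Fragments syntax.

```python
import re

TIMECODE = re.compile(r"\b(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\b")

def timecode_to_seconds(match):
    """Convert a matched 'hh:mm:ss' or 'mm:ss' time code to whole seconds."""
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)

def link_timecodes(description, media_url):
    """Wrap every time code in the description in a link to that offset."""
    def replace(match):
        offset = timecode_to_seconds(match)
        return f'<a href="{media_url}#t={offset}">{match.group(0)}</a>'
    return TIMECODE.sub(replace, description)
```

Given the description `Interview begins at 12:05; music at 01:15:30.`, the two time codes become links to offsets of 725 and 4530 seconds.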

As mentioned at the top of this post, the Silent Vigil and Allen Building Takeover collection includes our first batch of interactive transcripts.  We plan to launch more this Spring, so stay tuned!!

A Year in the Life of Digital Collections

2017 has been an action packed year for Digital Collections, full of exciting projects, interface developments and new processes and procedures. This blog post is an attempt to summarize just a few of our favorite accomplishments from the last year. Digital Collections is truly a cross-departmental collaboration here at Duke, and we couldn’t complete any of the work listed below without all our colleagues across the library – thanks to all!

New Digital Collections Portal

Regular visitors to Duke Digital Collections may have noticed that our old portal (library.duke.edu/digitalcollections/) now redirects to our new homepage on the Duke Digital Repository (DDR) public interface. We are thrilled to make this change! But never fear, your favorite collections that have not been migrated to DDR are still accessible either on our Tripod2 interface or by visiting new placeholder landing pages in the Digital Repository.

Audiovisual Features

Supporting A/V materials in the Digital Repository has been a major software development priority throughout 2017. As a result our A/V items are becoming more accessible and easier to share. Thanks to a year of hard work we can now do and support the following (we posted examples of these in a previous post):

  • Model, store and stream A/V derivatives
  • Share A/V easily through our embed feature (even on Duke WordPress sites- a long standing bug)
  • Finding aids can now display inline AV for DAOs from DDR
  • Clickable timecode links in item descriptions (example)
  • Display captions and interactive transcripts
  • Download and Export captions and transcripts (as .pdf, .txt, or .vtt)
  • Display Video thumbnails & poster frames

Rights Statements and Metadata

Bitstreams recently featured a review of all things metadata from 2017, many of which impact the digital collections program. We are especially pleased with our rights management work from the last year and our rights statements implementation (http://rightsstatements.org/en/). We are still in the process of retrospectively applying the statements, but we are making good progress. The end result will give our patrons a clearer indication of the copyright status of our digital objects and how they can be used. Read more about our rights management work in past Bitstreams posts.

Also this year in metadata, we have been developing integrations between ArchivesSpace (the tool Rubenstein Library uses for finding aids) and the repository (this is a project that has been in the works since 2015). With these new features Rubenstein’s archivist for metadata and encoding is in the process of reconciling metadata between ArchivesSpace and the Digital Repository for approximately 50 collections to enable bi-directional links between the two systems. Bi-directional linking helps our patrons move easily from a digital object in the repository to its finding aid or catalog record and vice versa. You can read about the start of this work in a blog post from 2016.

MSI

At the end of 2016, Duke Libraries purchased Multispectral Imaging (MSI) equipment, and members of Digital Collections, Data and Visualization Studies, Conservation Services, the Duke Collaboratory for Classics Computing, and the Rubenstein Library joined forces to explore how to best use the technology to serve the Duke community. The past year has been a time of research, development, and exploration around MSI and you can read about our efforts on Bitstreams. Our plan is to launch an MSI service in 2018. Stay tuned!

Ingest into the Duke Digital Repository (DDR)

With the addition of new colleagues focussed on research data management, there have been more demands on and enhancements to our DDR ingest tools. Digital collections has benefited from more robust batch ingest features as well as the ability to upload more types of files (captions, transcripts, derivatives, thumbnails) through the user interface. We can also now ingest nested folders of collections. On the opposite side of the spectrum we now have the ability to batch export sets of files or even whole collections.

Project Management

The Digital Collections Advisory Committee and Implementation Team are always looking for more efficient ways to manage our sprawling portfolio of projects and services. We started 2017 with a new call for proposals around the themes of diversity and inclusion, which resulted in 7 successful proposals that are now in the process of implementation.

In addition to a thematic call for proposals, we later rolled out a new process for our colleagues to propose smaller projects in response to faculty requests, events or for other reasons. In other words, these are projects of a certain size and scope that are not required to respond to a thematic call for proposals. The idea is that these projects can be easily implemented, and therefore do not require extensive project management to complete. Our first completed “easy” project is the Carlo Naya photograph albums of Venice.

In 2016 (perhaps even back in 2015), the digital collections team started working with colleagues in Rubenstein to digitize the set of collections known as “Section A”. The history of this moniker is a little uncertain, so let me just say that Section A is a set of 3000+ small manuscript collections (1-2 folders each) boxed together; each Section A box holds up to 30 collections. Section A collections are highly used and are often the subject of reproduction requests, hence they are perfect candidates for digitization. Our goal has been to set up a mass-digitization pipeline for these collections that involves vetting rights, updating description, evaluating their condition, digitizing them, ingesting them into DDR, crosswalking metadata and finally making them publicly accessible in the repository and through their finding aids. In 2017 we evaluated 37 boxes for rights restrictions, updated descriptions for 24 boxes, assessed the condition of 31 boxes, digitized 19 boxes, ingested 4 boxes, crosswalked metadata for 2 boxes, and box 1 is now online! Read more about the project in a May Bitstreams post. Although progress has felt slow given all the other projects we manage simultaneously, we really feel like our foot is on the gas now!

You can see the fruits of our digital collection labors in the list of new and migrated collections from the past year. We are excited to see what 2018 will bring!!

New Collections Launched in 2017

Migrated Collections

New and Recently Migrated Digital Collections

In the past 3 months, we have launched a number of exciting digital collections!  Our brand new offerings are either available now or will be very soon.  They are:

  • Duke Property Plats: https://repository.duke.edu/dc/uapropplat
  • Early Arabic Manuscripts (included in the recently migrated Early Greek Manuscripts): https://repository.duke.edu/dc/earlymss
  • International Broadsides (added to migrated Broadsides and Ephemera collection): https://repository.duke.edu/dc/broadsides
  • Orange County Tax List Ledger, 1875: https://repository.duke.edu/dc/orangecountytaxlist
  • Radio Haiti Archive, second batch of recordings: https://repository.duke.edu/dc/radiohaiti
  • William Gedney Finished Prints and Contact Sheets (newly re-digitized with new and improved metadata): https://repository.duke.edu/dc/gedney
A selection from the William Gedney Photographs digital collection

In addition to the brand new items, the digital collections team is constantly chipping away at the digital collections migration.  Here are the latest collections to move from Tripod 2 to the Duke Digital Repository (these are either available now or will be very soon):

One of the Greek items in the Early Manuscripts Collection.

Regular readers of Bitstreams are familiar with our digital collections migrations project; we first started writing about it almost 2 years ago when we announced the first collection to be launched in the new Duke Digital Repository interface.  Since then we have posted about various aspects of the migration with some regularity.

What we hoped would be a speedy transition is still a work in progress 2 years later.   This is due to a variety of factors, one of which is that the work itself is very complex.  Before we can move a collection into the digital repository it has to be reviewed, all digital objects fully accounted for, and all metadata remediated and crosswalked into the DDR metadata profile.  Sometimes this process requires little effort.   However other times, especially with older collections, we have items with no metadata, or metadata with no items, or the numbers in our various systems simply do not match.  Tracking down the answers can require some major detective work on the part of my amazing colleagues.
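
The first pass of that detective work is essentially a set comparison. The sketch below is a simplified illustration, not our actual tooling: the function name and the identifiers in the usage example are hypothetical, and it assumes each digital object and each metadata record carries a shared local identifier.

```python
def reconcile(file_ids, metadata_ids):
    """Compare identifiers found on disk with identifiers found in metadata."""
    files = set(file_ids)
    records = set(metadata_ids)
    return {
        "items_without_metadata": sorted(files - records),
        "metadata_without_items": sorted(records - files),
        "matched": sorted(files & records),
        # Raw counts can disagree even when the sets agree (e.g. duplicates).
        "counts_match": len(file_ids) == len(metadata_ids),
    }

# Hypothetical identifiers, for illustration only.
report = reconcile(
    file_ids=["abc001", "abc002", "abc004"],
    metadata_ids=["abc001", "abc002", "abc003", "abc003"],
)
```

Here `report` flags `abc004` as an item with no metadata, `abc003` as metadata with no item, and a count mismatch caused by the duplicated record. Anything in the first two lists still has to be tracked down by a person.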

Despite these challenges, we eagerly press on.  As each collection moves we get a little closer to having all of our digital collections under preservation control and providing access to all of them from a single platform.  Onward!

A Summer Day in the Life of Digital Collections

A recent tweet from my colleague in the Rubenstein Library (pictured above) pretty much sums up the last few weeks at work.  Although I rarely work directly with students and classes, I am still impacted by the hustle and bustle in the library when classes are in session.  Throughout the busy Spring I found myself saying, oh I’ll have time to work on that over the Summer.  Now Summer is here, so it is time to make some progress on those delayed projects while keeping others moving forward.  With that in mind here is your late Spring and early Summer round-up of Digital Collections news and updates.

Radio Haiti

A preview of the soon to be live Radio Haiti Archive digital collection.

The long anticipated launch of the Radio Haiti Archives is upon us.  After many meetings to review the metadata profile, discuss modeling relationships between recordings, and find a pragmatic approach to representing metadata in 3 languages all in the Duke Digital Repository public interface, we are now in preview mode, and it is thrilling.  Behind the scenes, Radio Haiti represents a huge step forward in the Duke Digital Repository’s ability to store and play back audio and video files.

You can already listen to many recordings via the Radio Haiti collection guide, and we will share the digital collection with the world in late June or early July.  In the meantime, check out this teaser image of the homepage.

Section A

My colleague Meghan recently wrote about our ambitious Section A digitization project, which will result in creating finding aids for and digitizing 3000+ small manuscript collections from the Rubenstein library.  This past week the 12 people involved in the project met to review our workflow.  Although we are trying to take a mass digitization and streamlined approach to this project, there are still a lot of people and steps.  For example, we spent about 20-30 minutes of our 90 minute meeting reviewing the various status codes we use on our giant Google spreadsheet and when to update them. I’ve also created a 6 page project plan that encompasses both a high and medium level view of the project. In addition to that document, each part of the process (appraisal, cataloging review, digitization, etc.) also has its own more detailed documentation.  This project is going to last at least a few years, so taking the time to document every step is essential, as is agreeing on status codes and how to use them.  It is a big process, but with every box the project gets a little easier.

Status codes for tracking our evaluation, remediation, and digitization workflow.
Section A Project Plan Summary

Diversity and Inclusion Digitization Initiative Proposals and Easy Projects

As Bitstreams readers and DUL colleagues know, this year we instituted 2 new processes for proposing digitization projects.  Our second digitization initiative deadline has just passed (it was June 15) and I will be working with the review committee to review new proposals as well as reevaluate 2 proposals from the first round in June and early July.  I’m excited to say that we have already approved one project outright (Emma Goldman papers), and plan to announce more approved projects later this Summer. 

We also codified “easy project” guidelines and have received several easy project proposals.  It is still too soon to really assess this process, but so far the process is going well.

Transcription and Closed Captioning

Speaking of A/V developments, another large project planned for this Summer is to begin codifying our captioning and transcription practices.  Duke Libraries has had a mandate to create transcriptions and closed captions for newly digitized A/V for over a year. In that time we have been working with vendors on selected projects.  Our next steps will proceed on two fronts: on the programmatic side we need to review the time and expense captioning efforts have incurred so far and see how we can scale our efforts to our backlog of publicly accessible A/V.  On the technology side I’ve partnered with one of our amazing developers to sketch out a multi-phase plan for storing captions and time-coded transcriptions and making them accessible and searchable in our user interface.  The first phase goes into development this Summer.  All of these efforts will no doubt be the subject of a future blog post.

Testing VTT captions of Duke Chapel Recordings in JWPlayer

Summer of Documentation

My aspirational Summer project this year is to update digital collections project tracking documentation, review/consolidate/replace/trash existing digital collections documentation and work with the Digital Production Center to create a DPC manual.  Admittedly writing and reviewing documentation is not the most exciting Summer plan,  but with so many projects and collaborators in the air, this documentation is essential to our productivity, communication practices, and my personal sanity.   

Late Spring Collection launches and Migrations

Over the past few months we launched several new digital collections as well as completed the migration of a number of collections from our old platform into the Duke Digital Repository.  

New Collections:

Migrated Collections:

…And so Much More!

In addition to the projects above, we continue to make slow and steady progress on our MSI system. We are also exploring using the FFv1 format for preserving selected moving image collections, planning the next phase of the Digital Collections migration into the Duke Digital Repository, thinking deeply about collection level metadata and structured metadata, planning to launch newly digitized Gedney images, integrating digital objects in finding aids and more.  No doubt some of these efforts will appear in subsequent Bitstreams posts.  In the meantime, let’s all try not to let this Summer fly by too quickly!

Enjoy Summer while you can!