Audiovisual materials account for a significant portion of Duke’s Digital Collections. All told, we now have over 3,400 hours of A/V content accessible online, spread over 14,000 audio and video files discoverable across various platforms. We’ve made great strides in recent years, introducing high-impact collections of recordings such as H. Lee Waters Films, the Jazz Loft Project Records, and Behind the Veil: Documenting African American Life in the Jim Crow South. This spring, the Duke Chapel Recordings collection (including over 1,400 recordings) became our first A/V collection developed in the emerging Duke Digital Repository platform. Completing this first phase of the collection required some initial development for A/V interfaces, and it’ll keep us on our toes to do more as the project progresses through 2019.
Preparing A/V for Access Online
When digitizing audio or video, our diligent Digital Production Center staff create a master file for digital preservation, and from that, a single derivative copy that’s smaller and appropriately compressed for public consumption on the web. The derivative files we create are compressed enough that they can be reliably pseudo-streamed (a.k.a. “progressive download”) to a user over HTTP in chunks (“byte ranges”) as they watch or listen. We are not currently using a streaming media server.
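To make the “byte ranges” idea concrete, here is a minimal Python sketch of how a file can be split into the `Range` header values a client might send. The function name and the chunk size are hypothetical and chosen only for illustration; in practice the browser decides which ranges to request, and the web server answers each one with a `206 Partial Content` response.

```python
def byte_ranges(total_size: int, chunk_size: int) -> list[str]:
    """Return HTTP Range header values that cover a file in fixed-size chunks.

    Illustrative only: real media players pick their own range sizes.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")

    ranges = []
    for start in range(0, total_size, chunk_size):
        # The end offset in a Range header is inclusive, hence the "- 1".
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


# A 10-byte file fetched 4 bytes at a time needs three requests.
print(byte_ranges(10, 4))
```

Because each request names its own start offset, a listener can jump to the middle of a recording without downloading everything before it.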
Here’s what’s typical for these files:
- Audio. MP3 format, 128 kbps bitrate. ~1 MB/minute.
- Video. MPEG-4 (.mp4) wrapper files. ~17 MB/minute or 1 GB/hour.
  - The video track is encoded as H.264 at about 2,300 kbps; 640×480 for standard 4:3.
  - The audio track is AAC-encoded at 160 kbps.
These specs are also consistent with what we request of external vendors in cases where we outsource digitization.
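The per-minute sizes follow directly from the bitrates. As a rough check, the sketch below converts a total bitrate in kbps to decimal megabytes per minute; the function name is made up for this example, and it ignores container overhead.

```python
def megabytes_per_minute(*bitrates_kbps: float) -> float:
    """Estimate file growth in MB per minute from one or more track bitrates.

    Uses decimal units (1 kbps = 1,000 bits/s, 1 MB = 1,000,000 bytes)
    and ignores container overhead, so treat the result as approximate.
    """
    total_kbps = sum(bitrates_kbps)
    bytes_per_minute = total_kbps * 1000 / 8 * 60
    return bytes_per_minute / 1_000_000


print(megabytes_per_minute(128))        # MP3 audio alone
print(megabytes_per_minute(2300, 160))  # H.264 video plus AAC audio
```

The audio figure works out to just under 1 MB per minute. The video estimate comes to roughly 18 MB per minute, in the same ballpark as the ~17 MB figure above given that the video bitrate is only approximate.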
The A/V Player Interface: JWPlayer
Since 2014, we have used a local instance of JWPlayer as our A/V player of choice for digital collections. JWPlayer bills itself as “The Most Popular Video Player & Platform on the Web.” It plays media directly in the browser by using standard HTML5 video specifications (now supported, for all intents and purposes, by all modern browsers).
We like JWPlayer because it’s well documented and easy to customize, with a robust JavaScript API to hook into it. Its developers do a nice job tracking browser support for all HTML5 video features, and they design their software with smart fallbacks to look and function consistently no matter what combo of browser & OS a user might have.
In the Duke Digital Repository and our archival finding aids, we’re now using the latest version of JWPlayer. It’s got a modern, flat aesthetic and is styled to match our color palette.
Playlists
Here’s an area where we extended the new JWPlayer with some local development to enhance the UI. When we have a playlist—that is, a recording that is made up of more than one MP3 or MP4 file—we wanted a clearer way for users to navigate between the files than what comes out of the box. It was fairly easy to create some navigational links under the player that indicate how many files are in the playlist and which is currently playing.
Captions & Transcripts
Work is now underway (by three students in the Duke Divinity School) to create timed transcripts of all the sermons given within the recorded services included in the Duke Chapel Recordings project.
We contracted through Popup Archive for computer-generated transcripts as a starting point. Those are about 80% accurate, but Popup provides a really nice interface for editing and refining the automated text before exporting it to its ultimate destination.
One of the most interesting aspects of HTML5 <video> is the <track> element, which lets you associate as many files of captions, subtitles, descriptions, or chapter information as needed. Track files are encoded as WebVTT, so we’ll use WebVTT files for the transcripts once they’re complete. We’ll also likely capture the start of a sermon within a recording as a WebVTT chapter marker to provide easier navigation to the part of the recording that’s the most likely point of interest.
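For a sense of what a WebVTT file looks like, here is a small Python sketch that turns timed text segments into WebVTT. The function names, the timings, and the "Sermon" label are all invented for illustration; this is not the export format of any particular transcription tool. A chapter file uses the same cue syntax as a caption file and is simply attached to the player as a separate track of kind "chapters".

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_vtt(cues: list[tuple[float, float, str]]) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical chapter track marking where a sermon starts and ends.
print(build_vtt([(1325.0, 2710.0, "Sermon")]))
```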
JWPlayer displays WebVTT captions (and chapter markers, too!). The captions will be wonderful for accessibility (especially for people with hearing disabilities); they can be toggled on/off within the media player window. We’ll also be able to use the captions to display an interactive, searchable transcript on the page near the player (see this example using JavaScript to parse the WebVTT). Our friends at NCSU Libraries have also shared some great work parsing WebVTT (using Ruby) for interactive transcripts.
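The parsing step behind a searchable transcript can be sketched in a few lines. The Python below is a simplified illustration, not the JavaScript or Ruby code mentioned above: it handles basic cues only, drops any cue settings, and skips blocks such as NOTE or STYLE because they contain no timing line. The function names are hypothetical.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a WebVTT timestamp (MM:SS.mmm or HH:MM:SS.mmm) to seconds."""
    parts = stamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def parse_vtt(text: str) -> list[dict]:
    """Parse WebVTT text into a list of {'start', 'end', 'text'} cues."""
    cues = []
    # Blocks (header, cues, notes) are separated by blank lines.
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            if "-->" in line:
                start, rest = line.split("-->", 1)
                end = rest.split()[0]  # ignore cue settings after the end time
                cues.append({
                    "start": to_seconds(start.strip()),
                    "end": to_seconds(end),
                    "text": " ".join(lines[i + 1:]).strip(),
                })
                break  # blocks without a timing line are skipped entirely
    return cues


def search(cues: list[dict], term: str) -> list[dict]:
    """Return cues whose text contains the term, ignoring case."""
    needle = term.lower()
    return [cue for cue in cues if needle in cue["text"].lower()]
```

Each matching cue carries its start time in seconds, which is what a transcript viewer needs in order to seek the player to that point when a reader clicks a search hit.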
The Future
We have a few years until the completion of the Duke Chapel Recordings project. Along the way, we expect to:
- add closed captions to the A/V
- create an interactive transcript viewer from the captions
- work those captions back into the index to aid discovery
- add a still-image extract from each video to use as a thumbnail and “poster frame” image
- offer up much more A/V content in the Duke Digital Repository
Stay tuned!