Category Archives: Digital Collections

FFV1: The Gains of Lossless

One of the greatest challenges to digitizing analog moving-image sources such as videotape and film reels isn’t the actual digitization. It’s the enormous file sizes that result, and the high costs associated with storing and maintaining those files for long-term preservation. For many years, Duke Libraries has generated 10-bit uncompressed preservation master files when digitizing our vast inventory of analog videotapes.

Unfortunately, one hour of uncompressed video can produce a 100 gigabyte file. That’s at least 50 times larger than an audio preservation file of the same duration, and about 1000 times larger than most still image preservation files. That’s a lot of data, and as we digitize more and more moving-image material over time, the long-term storage costs for these files can grow exponentially.

To help offset this challenge, Duke Libraries has recently implemented the FFV1 video codec as its primary format for moving image preservation. FFV1 was first created as part of the open-source FFmpeg software project, and has been developed, updated and improved by various contributors in the Association of Moving Image Archivists (AMIA) community.

FFV1 enables lossless compression of moving-image content. Just like uncompressed video, FFV1 delivers the highest possible image resolution, color quality and sharpness, while avoiding the motion compensation and compression artifacts that can occur with “lossy” compression. Yet, FFV1 produces a file that is, on average, 1/3 the size of its uncompressed counterpart.

sleeping bag
FFV1 produces a file that is, on average, 1/3 the size of its uncompressed counterpart. Yet, the audio & video content is identical, thanks to lossless compression.

The algorithms used in lossless compression are complex, but if you’ve ever prepared for a fall backpacking trip, and tightly rolled your fluffy goose-down sleeping bag into one of those nifty little stuff-sacks, essentially squeezing all the air out of it, you just employed (a simplified version of) lossless compression. After you set up your tent, and unpack your sleeping bag, it decompresses, and the sleeping bag is now physically identical to the way it was before you packed.

Yet, during the trek to the campsite, it took up a lot less room in your backpack, just like FFV1 files take up a lot less room in our digital repository. Like that sleeping bag, FFV1 lossless compression ensures that the compressed video file is mathematically identical to it’s pre-compressed state. No data is “lost” or irreversibly altered in the process.

Duke Libraries’ Digital Production Center utilizes a pair of 6-foot-tall video racks, which house a current total of eight videotape decks, comprised of a variety of obsolete formats such as U-matic (NTSC), U-matic (PAL), Betacam, DigiBeta, VHS (NTSC) and VHS (PAL, Secam). Each deck is converted from analog to digital (SDI) using Blackmagic Design Mini Converters.

The SDI signals are sent to a Blackmagic Design Smart Videohub, which is the central routing center for the entire system. Audio mixers and video transcoders allow the Digitization Specialist to tweak the analog signals so the waveform, vectorscope and decibel levels meet broadcast standards and the digitized video is faithful to its analog source. The output is then routed to one of two Retina 5K iMacs via Blackmagic UltraStudio devices, which convert the SDI signal to Thunderbolt 3.

FFV1 video digitization in progress in the Digital Production Center.

Because no major company (Apple, Microsoft, Adobe, Blackmagic, etc.) has yet adopted the FFV1 codec, multiple foundational layers of mostly open-source systems software had to be installed, tested and tweaked on our iMacs to make FFV1 work: Apple’s Xcode, Homebrew, AMIA’s vrecord, FFmpeg, Hex Fiend, AMIA’s ffmprovisr, GitHub Desktop, MediaInfo, and QCTools.

FFV1 operates via terminal command line prompts, so some understanding of programming language is helpful to enter the correct prompts, and be able to decipher the terminal logs.

The FFV1 files are “wrapped” in the open source Matroska (.mkv) media container. Our FFV1 scripts employ several degrees of quality-control checks, input logs and checksums, which ensure file integrity. The files can then be viewed using VLC media player, for Mac and Windows. Finally, we make an H.264 (.mp4) access derivative from the FFV1 preservation master, which can be sent to patrons, or published via Duke’s Digital Collections Repository.

An added bonus is that, not only can Duke Libraries digitize analog videotapes and film reels in FFV1, we can also utilize the codec (via scripting) to target a large batch of uncompressed video files (that were digitized from analog sources years ago) and make much smaller FFV1 copies, that are mathematically lossless. The script runs checksums on both the original uncompressed video file, and its new FFV1 counterpart, and verifies the content inside each container is identical.

Now, a digital collection of uncompressed masters that took up 9 terabytes can be deleted, and the newly-generated batch of FFV1 files, which only takes up 3 terabytes, are the new preservation masters for that collection. But no data has been lost, and the content is identical. Just like that goose-down sleeping bag, this helps the Duke University budget managers sleep better at night.

We’re hiring!

The Digital Production Center (DPC) is looking to hire a Digitization Specialist to join our team! The DPC team is on the forefront of enabling students, teachers, and researchers to continue their research by digitizing materials from our library collections.  We get to work with a variety of unique and rare materials (in a multitude of formats), and we use professional equipment to get the work done. Imagine working on digitizing papyri and comic books – the spectrum is far and wide! Get a glimpse of the collections that have been digitized by DPC staff by checking out our Duke Digital Collections.

Also, the people are really nice (and right now, we’re working in a socially distanced manner)!

More information about the job description can be found here. The successful candidate should be detailed-oriented, possess excellent organizational, project management skills, have scanning experience, and be able to work independently and effectively in a team environment. This position is part of the Digital Collections and Curation Services department and will report to the Digital Production Services manager.

More information about Duke’s benefit package can be found at https://hr.duke.edu/benefits. For more information and to apply, please submit an electronic resume, cover letter, and a list of 3 references to https://library.duke.edu/about/jobs/digitizationspecialist. Review of applications will begin immediately and will continue until the position is filled.

Access for One, Access for All: DPC’s Approach towards Folder Level Digitization

Earlier this year and prior to the pandemic, Digital Production Center (DPC) staff piloted an alternative approach to digitize patron requests with the Rubenstein Library’s Research Services (RLRS) team. The previous approach was focused on digitizing specific items that instruction librarians and patrons requested, and these items were delivered directly to that person. The alternative strategy, the Folder Level digitization approach, involves digitizing the contents of the entire folder that the item is contained in, ingesting these materials to the Duke Digital Repository (to enable Duke Library staff to retrieve these items), and when possible, publishing these materials so that they are available to anyone with internet access. This soft launch prepared us for what is now an all-hands-on-deck-but-in-a-socially-distant-manner digitization workflow.

Giao Luong Baker assessing folders in the DPC.

Since returning to campus for onsite digitization in late June, the DPC’s primary focus has been to perfect and ramp up this new workflow. It is important to note that the term “folder” in this case is more of a concept and that its contents and their conditions vary widely. Some folders may have 2 pages, other folders have over 300 pages. Some folders consists of pamphlets, notebooks, maps, papyri, and bound items. All this to say that a “folder” is a relatively loose term.

Like many initiatives at Duke Libraries, Folder Level Digitization is not just a DPC operation, it is a collaborative effort. This effort includes RLRS working with instructors and patrons to identify and retrieve the materials. RLRS also works with Rubenstein Library Technical Services (RLTS) to create starter digitization guides, which are the building blocks for our digitization guide. Lastly, RLRS vets the materials and determines their level of access. When necessary, Duke Library’s Conservation team steps in to prepare materials for digitization. After the materials are digitized, ingest and metadata work by the Digital Collections and Curation Services as well as the RLTS teams ensure that the materials are preserved and available in our systems.

Kristin Phelps captures a color target.

Doing this work in the midst of a pandemic requires that DPC work closely with the Rubenstein Library Access Services Reproduction Team (a section of RLRS) to track our workflow using a Google Doc. We track the point where the materials are identified by RLRS, through multiple quarantine periods, scanning, post processing, file delivery, to ingest. Also, DPC staff are digitizing in a manner that is consistent with COVID-19 guidelines. Materials are quarantined before and after they arrive at the DPC, machines and workspaces are cleaned before and after use, capture is done in separate rooms, and quality control is done off site with specialized calibrated monitors.

Since we started Folder Level digitization, the DPC has received close to 200 unique Instruction and Patron requests from RLRS. As of the publication of this post, 207 individual folders (an individual request may contain several folders) have been digitized. In total, we’ve scanned and quality controlled over 26,000 images since we returned to campus!

By digitizing entire folders, we hope this will allow for increased access to the materials without risking damage through their physical handling. So far we anticipate that 80 new digital collections will be ingested to the Duke Digital Repository. This number will only grow as we receive more requests. Folder Level Digitization is an exciting approach towards digital collection development, as it is directly responsive to instruction and researcher needs. With this approach, it is access for one, access for all!

Hope Harvested

This began as a quest for images of people engaging in recreational activities. Facing copious time indoors with limited places to go, many are looking for respite. I thought it would be uplifting to find pictures of people having fun. While combing through Duke University Libraries’ numerous digital collections in search of such images, several photos caught my eye. I clicked through hundreds of images reading their captions and summaries. Driven to delve deeper into collections for the story behind those smiling faces. As I sought these stories, I recalled the words of James Baldwin:

You think your pain and your heartbreak are unprecedented in the history of the world, but then you read.

Here were the lived experiences of people striving, aspiring, and persevering.

What started as a search for people pursuing pastimes quickly pivoted. It transformed into a search for people – smiling, laughing and hoping despite their circumstances. Presented below is a small harvest of photographs that inspired this post, including embedded links to their collections. As they did for me, I hope these photos may serve as a gateway to explore these inspired collections.

This image is from a series of photographs taken by James Karales between 1953 and 1957 in Rendville, Ohio, a small mining town which was one of the first racially integrated towns in the U.S.

 

African would-be immigrants play soccer in an enclosed compound at the Safi detention centre outside Valletta July 15, 2008. Around 1,500 illegal immigrants are currently held in detention in Malta for periods of up to 18 months. Though their intention was to reach Italy, most found themselves in Malta when they were rescued by the Maltese Armed Forces when they found themselves in difficulties while on their way to reach European soil from Africa.

 

Men eating at cooperative farm, central Cuba

Casting a Critical Eye on the Hayti-Elizabeth Street Renewal Area Maps

In 2019, one of the digital collections we made available to the public was a small set of architectural maps and plans titled the ‘Hayti-Elizabeth Street Renewal Area’. The short description of the maps indicates they ‘depict existing and proposed structures and modifications to the Hayti neighborhood in Durham, NC.’ Sounds pretty benign, right? Perhaps even kind of hopeful, given the word ‘renewal’?

Hayti-Elizabeth Street Renewal Area, Existing Land Use Map

Nope. This anodyne description does not tell the story of the harm caused by the Durham Urban Renewal project of the 1960s and 1970s. The Durham Redevelopment Commission intended to eliminate ‘urban blight’ via this project, which ultimately resulted in the destruction of more than 4,000 households and 500 businesses in predominantly African American areas of the city. The Hayti District, once a flourishing and self-sufficient neighborhood filled with Black-owned businesses, was largely demolished, divided, and effectively severed from what is now downtown Durham by the construction of NC Highway 147. 

Bull City 150, a “public history, geography and community engagement project” based here at Duke University, hosts a suite of excellent multi-media public history exhibitions about housing inequality in Durham on its website. One of these is Dismantling Hayti, which focuses in particular on the effects of urban renewal on the neighborhood and the city.

Dismantling Hayti, Bull City 150

But this story of so-called urban renewal is not just about Durham – it’s about the United States as a whole. From the 1950s to the 1980s, municipalities across the country demolished roughly 7.5 million dwelling units, with a vastly disproportionate impact on Black and low-income neighborhoods, in the name of revitalization. Bulldozing for highway corridors was frequently a part of urban renewal projects, happening in San Francisco, Memphis, Boston, Atlanta, Syracuse, Baltimore, everywhere in the country – the list goes on and on. And it includes Saint Paul, Minnesota, the city where, mourning and protesting the killing of yet another Black person at the hands of a white police officer, thousands of people occupied Interstate 94 in recent weeks, marching from the state capitol to Minneapolis, over a highway that was once the African American neighborhood of Rondo.  

Urban renewal projects led to what social psychiatrist Dr. Mindy Fulilove refers to as root shock – “a traumatic stress reaction related to the destruction of one’s emotional ecosystem”. This is but one thread in the fabric of white supremacy out of which our country was woven, among other twentieth century practices of redlining, discriminatory mortgage lending practices, denial of access to unemployment benefits, and rampant Jim Crow laws, which are still causing harm today. This is why it is important to interrogate the historical context of resources like the Hayti-Elizabeth Street Renewal Area maps – we should all accept the invitation extended on the Bull City 150 website to Durhamites to “reckon with the racial and economic injustices of the past 150 years and commit to building a more equitable future”.

ArcLight Migration: A Status Update After Three Months of Work

On January 20, 2020, we kicked off our first development sprint for implementing ArcLight at Duke as our new finding aids / collection guides platform. We thought our project charter was solid: thorough, well-vetted, with a reasonable set of goals. In the plan was a roadmap identifying a July 1, 2020 launch date and a list of nineteen high-level requirements. There was nary a hint of an impending global pandemic that could upend absolutely everything.

The work wasn’t supposed to look like this, carried out by zooming virtually into each other’s living rooms every day. Code sessions and meetings now require navigating around child supervision shifts and schooling-from-home responsibilities. Our new young office-mates occasionally dance into view or within earshot during our calls. Still, we acknowledge and are grateful for the privilege afforded by this profession to continue to do our work remotely from safe distance.

So, a major shoutout is due to my colleagues in the trenches of this work overcoming the new unforeseen constraints around it, especially Noah Huffman, David Chandek-Stark, and Michael Daul. Our progress to date has only been possible through resilience, collaboration, and willingness to keep pushing ahead together.

Three months after we started the project, we remain on track for a summer 2020 launch.

As a reminder, we began with the core open-source ArcLight platform (demo available) and have been building extensions and modifications in our local application in order to accommodate Duke needs and preferences. With the caveat that there’ll be more changes coming over the next couple months before launch, I want to provide a summary of what we have been able to accomplish so far and some issues we have encountered along the way. Duke staff may access our demo app (IP-restricted) for an up-to-date look at our work in progress.

Homepage

Homepage design for Duke’s ArcLight finding aids site.
  • Duke Branding. Aimed to make an inviting front door to the finding aids consistent with other modern Duke interfaces, similar to–yet distinguished enough from–other resources like the catalog, digital collections, or Rubenstein Library website.
  • Featured Items. Built a configurable set of featured items from the collections (with captions), to be displayed randomly (actual selections still in progress).
  • Dynamic Content. Provided a live count of collections; we might add more indicators for types/counts of materials represented.

Layout

A collection homepage with a sidebar for context navigation.
  • Sidebar. Replaced the single-column tabbed layout with a sidebar + main content area.
  • Persistent Collection Info. Made collection & component views more consistent; kept collection links (Summary, Background, etc.) visible/available from component pages.
  • Width. Widened the largest breakpoint. We wanted to make full use of the screen real estate, especially to make room for potentially lengthy sidebar text.

Navigation

Component pages contextualized through a sidebar navigator and breadcrumb above the main title.
  • Hierarchical Navigation. Restyled & moved the hierarchical tree navigation into the sidebar. This worked well functionally in ArcLight core, but we felt it would be more effective as a navigational aid when presented beside rather than below the content.
  • Tooltips & Popovers. Provided some additional context on mouseovers for some navigational elements.

    Mouseover context in navigation.
  • List Child Components. Added a direct-child list in the main content for any series or other component. This makes for a clear navigable table of what’s in the current series / folder / etc. Paginating it helps with performance in cases where we might have 1,000+ sibling components to load.
  • Breadcrumb Refactor. Emphasized the collection title. Kept some indentation, but aimed for page alignment/legibility plus a balance of emphasis between current component title and collection title.

    Breadcrumb trail to show the current component’s nesting.

Search Results

Search results grouped by collection, with keyword highlighting.
  • “Group by Collection” as the default. Our stakeholders were confused by atomized components as search results outside of the context of their collections, so we tried to emphasize that context in the default search.
  • Revised search result display. Added keyword highlighting within result titles in Grouped or All view. Made Grouped results display checkboxes for bookmarking & digitized content indicators.
  • Advanced Search. Kept the global search box simple but added a modal Advanced search option that adds fielded search and some additional filters.

Digital Objects Integration

Digital objects from the Duke Digital Repository are presented inline in the finding aid component page.
  • DAO Roles. Indexed the @role attribute for <dao> elements; we used that to call templates for different kinds of digital content
  • Embedded Object Viewers. Used the Duke Digital Repository’s embed feature, which renders <iframe>s for images and AV.

Indexing

  • Whitespace compression. Added a step to the pipeline to remove extra whitespace before indexing. This seems to have slightly accelerated our time-to-index rather than slow it down.
  • More text, fewer strings. We encountered cases where note-like fields indexed as strings by ArcLight core (e.g., <scopecontent>) needed to be converted to text because we had more than 32,766 bytes of data (limit for strings) to put in them. In those cases, finding aids were failing to index.
  • Underscores. For the IDs that end up in a URL for a component, we added an underscore between the finding aid slug and the component ID. We felt these URLs would look cleaner and be better for SEO (our slugs often contain names).
  • Dates. Changed the date normalization rules (some dates were being omitted from indexing/display)
  • Bibliographic ID. We succeeded in indexing our bibliographic IDs from our EADs to power a collection-level Request button that leads a user to our homegrown requests system.

Formatting

  • EAD -> HTML. We extended the EAD-to-HTML transformation rules for formatted elements to cover more cases (e.g., links like <extptr> & <extref> or other elements like <archref> & <indexentry>)

    Additional formatting and link render rules applied.
  • Formatting in Titles. We preserved bold or italic formatting in component titles.

ArcLight Core Contributions

  • We have been able to contribute some of our code back to the ArcLight core project to help out other adopters.

Setting the Stage

The behind-the-scenes foundational work deserves mention here — it represents some of the most complex and challenging aspects of the project.  It makes the application development driving the changes I’ve shared above possible.

  • Built separate code repositories for our Duke ArcLight application and our EAD data
  • Gathered a diverse set of 40 representative sample EADs for testing
  • Dockerized our Duke ArcLight app to simplify developer environment setup
  • Provisioned a development/demo server for sharing progress with stakeholders
  • Automated continuous integration and deployment to servers using GitLabCI
  • Performed targeted data cleanup
  • Successfully got all 4,000 of our finding aids indexed in Solr on our demo server

Our team has accomplished a lot in three months, in large part due to the solid foundation the ArcLight core software provides. We’re benefiting from some amazing work done by many, many developers who have contributed their expertise and their code to the Blacklight and ArcLight codebases over the years. It has been a real pleasure to be able to build upon an open source engine– a notable contrast to our previous practice of developing everything in-house for finding aids discovery and access.

Still, much remains to be addressed before we can launch this summer.

The Road Ahead

Here’s a list of big things we still plan to tackle by July (other minor revisions/bugfixes will continue as well)…

  • ASpace -> ArcLight. We need a smoother publication pipeline to regularly get data from ArchivesSpace indexed into ArcLight.
  • Access & Use Statements. We need to revise the existing inheritance rules and make sure these statements are presented clearly. It’s especially important when materials are indeed restricted.
  • Relevance Ranking. We know we need to improve the ranking algorithm to ensure the most relevant results for a query appear first.
  • Analytics. We’ll set up some anonymized tracking to help monitor usage patterns and guide future design decisions.
  • Sitemap/SEO. It remains important that Google and other crawlers index the finding aids so they are discoverable via the open web.
  • Accessibility Testing / Optimization. We aim to comply with WCAG2.0 AA guidelines.
  • Single-Page View. Many of our stakeholders are accustomed to a single-page view of finding aids. There’s no such functionality baked into ArcLight, as its component-by-component views prioritize performance. We might end up providing a downloadable PDF document to meet this need.
  • More Data Cleanup. ArcLight’s feature set (especially around search/browse) reveals more places where we have suboptimal or inconsistent data lurking in our EADs.
  • More Community Contributions. We plan to submit more of our enhancements and bugfixes for consideration to be merged into the core ArcLight software.

If you’re a member of the Duke community, we encourage you to explore our demo and provide feedback. To our fellow future ArcLight adopters, we would love to hear how your implementations or plans are shaping up, and identify any ways we might work together toward common goals.

Stay safe, everyone!

Labor in the Time of Coronavirus

The Coronavirus pandemic has me thinking about labor–as a concept, a social process, a political constituency, and the driving force of our economy–in a way that I haven’t in my lifetime. It’s become alarmingly clear (as if it wasn’t before) that we all need food, supplies, and services to survive past next week, and that there are real human beings out there working to produce and deliver these things. No amount of entrepreneurship, innovation, or financial sleight of hand will help us through the coming months if people are not working to provide the basic requirements for life as we know it.

This blog post draws from  images in our digitized library collections to pay tribute to all of the essential workers who are keeping us afloat during these challenging times. As I browsed these photographs and mused on our current situation, a few important and oft-overlooked questions came to mind.

Who grows our food? Where does it come from and how is it processed? How does it get to us?

Bell pepper pickers, 1984 June. Paul Kwilecki Photographs.
Prisoners at the county farm killing hogs, 1983 Mar. Paul Kwilecki Photographs.
Worker atop rail car loading corn from storage tanks, 1991 Sept. Paul Kwilecki Photographs.
Manuel Molina, mushroom farm worker, Kennett Square, PA 1981. Frank Espada Photographs.

What kind of physical environment do we work in and how does that affect us?

Maids in a room of the Stephen Decatur Hotel shortly before it was torn down, 1970. Paul Kwilecki Photographs
Worker in pit preparing to weld. Southeastern Minerals Co. Bainbridge, 1991 Aug. Paul Kwilecki Photographs
Loggers in the woods near Attapulgus, 1978 Feb. Paul Kwilecki Photographs.

How do we interact with machines and technology in our work? Can our labor be automated or performed remotely?

Worker signaling for more logs. Elberta Crate and Box. Co. Bainbridge, 1991 Sept. Paul Kwilecki Photographs.
Worker, Williamson-Dickie plant. Bainbridge, 1991 Sept. Paul Kwilecki Photographs.
Machine operator watching computer controlled lathe, 1991 July. Paul Kwilecki Photographs.
Worker at her machine. Elberta Crate and Box Co. Bainbridge, 1981 Nov. Paul Kwilecki Photographs.

What equipment and clothing do we need to work safely and productively?

Cotton gin worker wearing safety glass and ear plugs for noise protection. Decatur Gin Co., 1991 Sept. Paul Kwilecki Photographs.
Worker operating bagging machine. Flint River Mills. Bainbridge, 1991 July. Paul Kwilecki Photographs.
Worker at State Dock, 1992 July. Paul Kwilecki Photographs.

Are we paid fairly for our work? How do relative wages for different types of work reflect what is valued in our society?

No known title. William Gedney Photographs.

How we think about and respond to these questions will inform how we navigate the aftermath of this ongoing crisis and whether or not we thrive into the future. As we celebrate International Workers’ Day on May 1 and beyond, I hope everyone will take some time to think about what labor means to them and to our society as a whole.

Beyond One Thousand Words

There is a particular fondness that I hold for digital photograph collections. If I had to pinpoint when this began, then I would have to say it started while digitizing material on a simple Epson flatbed scanner as an undergraduate student worker in the archives.

Witnessing the physical become digital is a wonder that never gets old.

Every day we are generating digital content. Pet pics. Food pics. Selfies. Gradually building a collection of experiences as we document our lives in images. Sporadic born digital collections stored on devices and in the cloud.

I do not remember the last time I printed a photograph.

My parents have photo albums that I love. Seeing images of them, then us. The tacky adhesive and the crinkle of thin plastic film as it is pulled back to lift out a photo. That perfect square imprint left behind from where the photo rested on the page.

Pretty sure that Polaroid camera is still around somewhere.

Time bound up in a book.

Beyond their visual appeal, I appreciate how photos capture time. Nine months have passed since I moved to North Carolina. I started 2019 in Chicago and ended it in Durham. These photos of my Winter in both places illustrate that change well.

Sometimes I want to pull down my photos from the cloud and just print everything. Make my own album. Have something with heft and weight to share and say, “Hey, hold and look at this.” That sensory experience is invaluable.

Yet, I also value the convenience of being able to view hundreds of photos with the touch of a button.

Duke University Libraries offers access to thousands of images through its Digital Collections.

Here’s a couple photo collections to get you started:

All About that Time Base

The video digitization system in Duke Libraries’ Digital Production Center utilizes many different pieces of equipment: power distributors, waveform and vectorscope monitors, analog & digital routers, audio splitters & decibel meters, proc-amps, analog (BNC, XLR and RCA) to digital (SDI) converters, CRT & LCD video monitors, and of course an array of analog video playback decks of varying flavors (U-matic-NTSC, U-matic-PAL, Betacam SP, DigiBeta, VHS-NTSC and VHS-PAL/SECAM). We also transfer content directly from born-digital DV and MiniDV tapes.

A grandfather clock is a time base.

One additional component that is crucial to videotape digitization is the Time Base Corrector (TBC). Each of our analog video playback decks must have either an internal or external TBC, in order to generate an image of acceptable quality. At the recent Association of Moving Image Archivist’s Conference in Baltimore, George Blood (of George Blood Audio/Video/Film/Data) gave a great presentation on exactly what a Time Base Corrector is, appropriately entitled “WTF is a TBC?” Thanks to George for letting me relay some of his presentation points here.

A time base is a consistent reference point that one can utilize to stay in sync. For example, The Earth rotating around the Sun is a time base that the entire human race relies on, to stay on schedule. A grandfather clock is also a time base. And so is a metronome, which a musical ensemble might use to all stay “in time.”

Frequency is defined as the number of occurrences of a repeating event per unit of time. So, the frequency of the Earth rotating around the Sun is once per 24 hrs. The frequency of a grandfather clock is one pendulum swing per second. The clock example can also be defined as one “cycle per second” or one hertz (Hz), named after Heinrich Hertz, who first conclusively proved the existence of electromagnetic waves in the late 1800’s.

One of the DPC’s external Time Base Correctors

But anything mechanical, like grandfather clocks and videotape decks, can be inconsistent. The age and condition of gears and rods and springs, as well as temperature and humidity, can significantly affect a grandfather clock’s ability to display the time correctly.

Videotape decks are similar, full of numerous mechanical and electrical parts that produce infinite variables in performance, affecting the deck’s ability to play the videotape’s frames-per-second (frequency) in correct time.

NTSC video is supposed to play at 29.97 frames-per-second, but due to mechanical and electro-magnetic variables, some frames may be delayed, or some may come too fast. One second of video might not have enough frames, another second may have too many. Even the videotape itself can stretch, expand and contract during playback, throwing off the timing, and making the image wobbly, jittery, too bright or dark, too blue, red or green.

A Time Base Corrector does something awesome. As the videotape plays, the TBC stores the unstable video content briefly, fixes the timing errors, and then outputs the corrected analog video signal to the DPC’s analog-to-digital converters. Some of our videotape decks have internal TBCs, which look like a computer circuit board (shown below). Others need an external TBC, which is a smaller box that attaches to the output cables coming from the videotape deck (shown above, right). Either way, the TBC can delay or advance the video frames to lock them into correct time, which fixes all the errors.

An internal Time Base Corrector card from a Sony U-matic BVU-950 deck

An internal TBC is actually able to “talk” to the videotape deck, and give it instructions, like this…

“Could you slow down a little? You’re starting to catch up with me.”

“Hey, the frames are arriving at a strange time. Please adjust the timing between the capstan and the head drum.”

“There’s a wobble in the rate the frames are arriving. Can you counter-wobble the capstan speed to smooth that out?”

“Looks like this tape was recorded with bad heads. Please increase gain on the horizontal sync pulse so I can get a clearer lock.”

Without the mighty TBC, video digitization would not be possible, because all those errors would be permanently embedded in the digitized file. Thanks to the TBC, we can capture a nice, clean, stable image to share with generations to come, long after the magnetic videotape, and playback decks, have reached the end of their shelf life.

Digital Collections 2019

‘Tis the time of year for top 10 lists. Here at Duke Digital Collections HQ, we cannot just pick 10, because all our digital collections are tops!  What follows is a list of all the digital collections we have launched for public access this calendar year.

Our newest collections include a range of formats and subject areas from 19th Century manuscripts to African American soldiers photograph albums to Duke Mens Basketball posters to our first Multispectral Images of papyrus to be ingested into the repository.  We also added new content to 4 existing digital collections.  Lastly, our platform migration is still ongoing, but we made some incredible progress this year as you will see below.  Our goal is to finish the migration by the end of 2020.

New Digital Collections

Additions to Existing Collections

Migrated Collections into the Duke Digital Repository