Category Archives: Digital Collections

Respectfully Yours: A Deep Dive into Digitizing the Booker T. Washington Collection

Post authored by Jen Jordan, Digital Collections Intern.

Hello, readers. This marks my third and final blog post as the Digital Collections intern, a position that I began in June of last year.* Over the course of this internship I have been fortunate to gain experience in nearly every step of the digitization and digital collections processes. One of the things I’ve come to appreciate most about the different workflows I’ve learned about is how well they accommodate the variety of collection materials that pass through. This means that when unique cases arise, there is space to consider them. I’d like to describe one such case, involving a pretty remarkable collection.

Cheyne, C.E. “Booker T. Washington sitting and holding books,” 1903. 2 photographs on 1 mount : gelatin silver print ; sheets 14 x 10 cm. In Washington, D.C., Library of Congress Prints and Photographs Division.

In early October I arrived to work in the Digital Production Center (DPC) and was excited to see the Booker T. Washington correspondence, 1903-1916, 1933 and undated was next up in the queue for digitization. The collection is small, containing mostly letters exchanged between Washington, W. E. B. DuBois, and a host of other prominent leaders in the Black community during the early 1900s. A 2003 article published in Duke Magazine shortly after the Washington collection was donated to the John Hope Franklin Research Center provides a summary of the collection and the events it covers. 

Arranged chronologically, the papers were stacked neatly in a small box, each letter sealed in a protective sleeve, presumably after undergoing extensive conservation treatments to remediate water and mildew damage. As I scanned the pages, I made a note to learn more about the relationship between Washington and DuBois, as well as the events the collection is centered around—the Carnegie Hall Conference and the formation of the short-lived Committee of Twelve for the Advancement of the Interests of the Negro Race. When I did follow up, I was surprised to find that remarkably little has been written about either.

As I’ve mentioned before, there is little time to actually look at materials when we scan them, but the process can reveal broad themes and tone. Many of the names in the letters were unfamiliar to me, but I observed extensive discussion between DuBois and Washington regarding who would be invited to the conference and included in the Committee of Twelve. I later learned that this collection documents what would be the final attempt at collaboration between DuBois and Washington.

Washington to Browne, 21 July 1904, South Weymouth, Massachusetts

Once scanned, the digital surrogates pass through several stages in the DPC before they are prepared for ingest into the Duke Digital Repository (DDR); you can read a comprehensive overview of the DPC digitization workflow here. Fulfilling patron requests is top priority, so after patrons receive the requested materials, it might be some time before the files are submitted for ingest to the DDR. Because of this, I was fortunate to be on the receiving end of the BTW collection in late January. By then I was gaining experience in the actual creation of digital collections—basically everything that happens with the files once the DPC signals that they are ready to move into long term storage. 

There are a few different ways that new digital collections are created. Thus far, most of my experience has been with the files produced through patron requests handled by the DPC. These tend to be smaller in size and have a simple file structure. The files are migrated into the DDR, into either a new or existing collection, after which file counts are checked, and identifiers assigned. The collection is then reviewed by one of a few different folks with RL Technical Services. Noah Huffman conducted the review in this case, after which he asked if we might consider itemizing the collection, given the letter-level descriptive metadata available in the collection guide. 

I’d like to pause for a moment to discuss the tricky nature of “itemness,” and how the meaning can shift between RL and DCCS. If you reference the collection guide linked in the second paragraph, you will see that the BTW collection received item-level description during processing, with each letter constituting an item in the collection. The physical arrangement of the papers does not reflect the itemized intellectual arrangement, as the letters are grouped together in the box they are housed in. When fulfilling patron reproduction requests, itemness is generally dictated by physical arrangement, in what is called the folder-level model; materials housed together are treated as a single unit. So in this case, because the letters were grouped together inside of the box, the box was treated as the folder, or item. If, however, each letter in the box were housed within its own folder, then each folder would be considered an item. To be clear, the papers were housed according to best practices; my intent is simply to describe how the processes between the two departments sometimes diverge.

Processing archival collections is labor intensive, so it’s increasingly uncommon to see item-level description. Collections can sit unprocessed in “backlog” for many years, and though the depth of that backlog varies by institution, even well-resourced archives confront the problem of backlog. Enter: More Product, Less Process (MPLP), introduced by Mark Greene and Dennis Meissner in a 2005 article as a means to address the growing problem. They called on archivists to prioritize access over meticulous arrangement and description.  

The spirit of folder-level digitization is quite similar to MPLP, as it enables the DPC to provide access to a broader selection of collection materials digitized through patron requests, and it also simplifies the process of putting the materials online for public access. Most of the time, the DPC’s approach to itemness aligns closely with the level of description given during processing of the collection, but the inevitable variance found between archival collections requires a degree of flexibility from those working to provide access to them. Numerous examples of digital collections that received item-level description can be found in the DDR, but those are generally tied to planned efforts to digitize specific collections. 

Because the BTW collection was digitized as an item, the digital files were grouped together in a single folder, which translated to a single landing page in the DDR’s public user interface. Itemizing the collection would give each item/letter its own landing page, with the potential to add unique metadata. Similarly, when users navigate the RL collection guide, embedded digital surrogates appear for each item. A moment ago I described the utility of More Product Less Process. There are times, however, when it seems right to do more. Given the research value of this collection, as well as its relatively small size, the decision to proceed with itemization was unanimous. 

Itemizing the collection was fairly straightforward. Noah shared a spreadsheet with metadata from the collection guide. There were 108 items, with each item’s title containing the sender and recipient of a correspondence, as well as the location and date sent. Given the collection’s chronological physical arrangement, it was fairly simple to work through the files and assign them to new folders. Once that was finished, I selected additional descriptive metadata terms to add to the spreadsheet, in accordance with the DDR Metadata Application Profile. Because there was a known sender and recipient for almost every letter, my goal was to identify any additional name authority records not included in the collection guide. This would provide an additional access point by which to navigate the collection. It would also help me to identify death dates for the creators, which determines copyright status. I think the added time and effort was well worth it.

This isn’t the space for analysis, but I do hope you’re inspired to spend some time with this fascinating collection. Primary source materials offer an important path to understanding history, and this particular collection captures the planning and aftermath of an event that hasn’t received much analysis. There is more coverage of what came after; Washington and DuBois parted ways, after which DuBois became a founding member of the Niagara Movement. Though also short lived, it is considered a precursor to the NAACP, which many members of the Niagara Movement would go on to join. A significant portion of W. E. B. DuBois’s correspondence has been digitized and made available to view through UMass Amherst. It contains many additional letters concerning the Carnegie Conference and Committee of Twelve, offering additional context and perspective, particularly in certain letters that were surely not intended for Washington’s eyes. What I found most fascinating, though, was the evidence of less public (and less adversarial) collaboration between the two men.

The additional review and research required by the itemization and metadata creation was such a fascinating and valuable experience. This is true on a professional level as it offered the opportunity to do something new, but I also felt moved to try to understand more about the cast of characters who appear in this important collection. That endeavor extended far beyond the hours of my internship, and I found myself wondering if this was what the obsessive pursuit of a historian’s work is like. In any case, I am grateful to have learned more, and also reminded that there is so much more work to do.

Click here to view the Booker T. Washington correspondence in the Duke Digital Repository.

*Indeed, this marks my final post in this role, as my internship concludes at the end of April, after which I will move on to a permanent position. Happily, I won’t be going far, as I’ve been selected to remain with DCCS as one of the next Repository Services Analysts!    

Sources

Cheyne, C.E. “Booker T. Washington sitting and holding books,” 1903. 2 photographs on 1 mount : gelatin silver print ; sheets 14 x 10 cm. In Washington, D.C., Library of Congress Prints and Photographs Division. Accessed April 5, 2022. https://www.loc.gov/pictures/item/2004672766/

 

Wars of Aliens, Men, and Women: or, Some Things we Digitized in the DPC this Year

Post authored by Jen Jordan, Digital Collections Intern. 

As another strange year nears its end, I’m going out on a limb to assume that I’m not the only one around here challenged by a lack of focus. With that in mind, I’m going to keep things relatively light (or relatively unfocused) and take you readers on a short tour of items that have passed through the Digital Production Center (DPC) this year. 

Shortly before the arrival of COVID-19, the DPC implemented a folder-level model for digitization. This model was not developed in anticipation of a life-altering pandemic, but it was well-suited to meet the needs of researchers who, for a time, were unable to visit the Rubenstein Library to view materials in person. You can read about the implementation of folder-level digitization and its broader impact here. To summarize, before spring of 2020 it was standard practice to fill patron requests by imaging only the item needed (e.g., a single page within a folder). Now, the default practice is to digitize the entire folder of materials. This has produced a variety of positive outcomes for stakeholders in the Duke University Libraries and broader research community, but for the purpose of this blog, I’d like to describe my experience interacting with materials in this way.

Digitization is time consuming, so the objective is to move as quickly as possible while maintaining a high level of accuracy. There isn’t much time for meaningful engagement with collection items, but context reveals itself in bits and pieces. Themes rise to the surface when working with large folders of material on a single topic, and sometimes the image on the page demands to be noticed. 

Even while working quickly, one would be hard-pressed to overlook this Vietnam-era anti-war message. One might imagine that was by design. From the Student Activism Reference collection: https://repository.duke.edu/dc/uastuactrc.

On more than one occasion I’ve found myself thinking about the similarities between scanning and browsing a social media app like Instagram. Stick with me here! Broadly speaking, both offer an endless stream of visual stimuli with little opportunity for meaningful engagement in the moment. Social media, when used strategically, can be world-expanding. Work in the DPC has been similarly world-expanding, but instead of an algorithm curating my experience, the information that I encounter on any given day is curated by patron requests for digitization. Also similar to social media is the range of internal responses triggered over the course of a work day, and sometimes in the span of a single minute. Amusement, joy, shock, sorrow—it all comes up.

I started keeping notes on collection materials and topics to revisit on my own time. Sometimes I was motivated by a stray fascination with the subject matter. Other times I encountered collections relating to prominent historical figures or events that I realized I should probably know a bit more about.

 

Image from the WSPU Scrapbook.

First wave feminism was one such topic that revealed itself. It was a movement I knew little about, but the DPC has digitized numerous items relating to women’s suffrage and other feminist issues at the turn of the 20th century. I was particularly intrigued by the radical leanings of the UK’s Women’s Social and Political Union (WSPU), organized by Emmeline Pankhurst to fight for the right to vote. When I started looking at newspaper clippings pasted into a scrapbook documenting WSPU activities, I was initially distracted by the amusing choice of words (“Coronation chair damaged by wild women’s bomb”). Curious to learn more, I went home and read about the WSPU. The following excerpt is from a speech by Pankhurst in which she provides justification for the militant tactics employed by the WSPU:

I want to say here and now that the only justification for violence, the only justification for damage to property, the only justification for risk to the comfort of other human beings is the fact that you have tried all other available means and have failed to secure justice. I tell you that in Great Britain there is no other way…

Pankhurst argued that men had to take the right to vote through war, so why shouldn’t women also resort to violence and destruction? And so they did.

As Rubenstein Library is home to the Sallie Bingham Center, it’s unsurprising that the DPC digitizes a fair amount of material on women’s issues. To share a few more examples, I appreciate the juxtaposition of the following two images, both of which I find funny, and yet sad.

Source collection: Young woman’s scrapbook, 1900-1905 and n.d.

The advertisement to the right is pasted inside a young woman’s scrapbook dated 1900-1905. It contains information on topics such as etiquette, how to manage a household, and how to be a good wife. Are we to gather that proper shade cloth is necessary to keep a man happy?

In contrast, the image below and to the left is from the book L’amour libre, in which French feminist Madeleine Vernet describes prostitution and marriage as the same kind of prison, with “free love” as the only answer. Some might call that a hyperbolic comparison, but after perusing the young woman’s scrapbook, I’m not so sure. I’m just thankful to have been born a woman near the end of the 20th century and not the start of it.

From the book L’amour libre by Madeleine Vernet

This may be difficult to believe, but I didn’t set out to write a blog so focused on struggle. The reality, however, is that our special collections are full of struggle. That’s not all there is, of course, but I’m glad this material is preserved. It holds many lessons, some of which we still have yet to learn. 

I think we can all agree that 2021 was, well, a challenging year. I’d be remiss not to close with a common foe we might all rally around. As we move into 2022 and beyond, venturing ever deeper into space, we may encounter this enemy sooner than we imagined…

Image from an illustrated 1906 French translation of H.G. Wells’s ‘War of the Worlds’.

Sources:

Pankhurst, Emmeline. Why We Are Militant: A Speech Delivered by Mrs. Pankhurst in New York, October 21, 1913. London: Women’s Press, 1914. Print.

“‘Prayers for Prisoners’ and church protests.” Historic England, n.d., https://historicengland.org.uk/research/inclusive-heritage/womens-history/suffrage/church-protests/ 

 

Good News from the DPC: Digitization of Behind the Veil Tapes is Underway

This post was written by Jen Jordan, a graduate student at Simmons University studying Library Science with a concentration in Archives Management. She is the Digital Collections intern with the Digital Collections and Curation Services Department. Jen will complete her master’s degree in December 2021.

The Digital Production Center (DPC) is thrilled to announce that work is underway on a three-year National Endowment for the Humanities (NEH) grant-funded project to digitize the entirety of Behind the Veil: Documenting African-American Life in the Jim Crow South, an oral history project that produced 1,260 interviews spanning more than 1,800 audio cassette tapes. Accompanying the 2,000-plus hours of audio is a sizable collection of visual materials (e.g., photographic prints and slides) that form a connection with the recorded voices.

We are here to summarize the logistical details relating to the digitization of this incredible collection. To learn more about its historical significance and the grant that is funding this project, titled “Documenting African American Life in the Jim Crow South: Digital Access to the Behind the Veil Project Archive,” please take some time to read the July announcement written by John Gartrell, Director of the John Hope Franklin Research Center and Principal Investigator for this project. Co-Principal Investigator of this grant is Giao Luong Baker, Digital Production Services Manager.

Digitizing Behind the Veil (BTV) will require, in part, the services of outside vendors to handle the audio digitization and subsequent captioning of the recordings. While the DPC regularly digitizes audio recordings, we are not equipped to do so at this scale (while balancing other existing priorities). The folks at Rubenstein Library have already been hard at work double checking the inventory to ensure that each cassette tape and case is labeled with an identifier. The DPC then received the tapes, filling 48 archival boxes, along with a digitization guide (i.e., an Excel spreadsheet) containing detailed metadata for each tape in the collection. Upon receiving the tapes, DPC staff set to boxing them for shipment to the vendor. As of this writing, the boxes are snugly wrapped on a pallet in Perkins Shipping & Receiving, where they will soon begin their journey to a digital format.

The wait has begun! In eight to twelve weeks we anticipate receiving the digital files, at which point we will perform quality control (QC) on each one before sending them off for captioning. As the captions are returned, we will run through a second round of QC. From there, the files will be ingested into the Duke Digital Repository, at which point our job is complete. Of course, we still have the visual materials to contend with, but we’ll save that for another blog! 

As we creep closer to the two-year mark of the COVID-19 pandemic and the varying degrees of restrictions that have come with it, the DPC will continue to focus on fulfilling patron reproduction requests, which have comprised the bulk of our work for some time now. We are proud to support researchers by facilitating digital access to materials, and we are equally excited to have begun work on a project of the scale and cultural impact that is Behind the Veil. When finished, this collection will be accessible for all to learn from and meditate on—and that’s what it’s all about. 

 

FFV1: The Gains of Lossless

One of the greatest challenges to digitizing analog moving-image sources such as videotape and film reels isn’t the actual digitization. It’s the enormous file sizes that result, and the high costs associated with storing and maintaining those files for long-term preservation. For many years, Duke Libraries has generated 10-bit uncompressed preservation master files when digitizing our vast inventory of analog videotapes.

Unfortunately, one hour of uncompressed video can produce a 100 gigabyte file. That’s at least 50 times larger than an audio preservation file of the same duration, and about 1,000 times larger than most still image preservation files. That’s a lot of data, and as we digitize more and more moving-image material over time, the long-term storage costs for these files can grow rapidly.
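For a rough sense of where a figure like that comes from, here is a back-of-the-envelope sketch in Python. The parameters are assumptions rather than details from our workflow: standard-definition NTSC video at 720x486, 30000/1001 frames per second, and 10-bit 4:2:2 sampling (20 bits per pixel). It also ignores audio and any container or packing overhead.

```python
from fractions import Fraction


def uncompressed_video_bytes(width, height, fps, bits_per_pixel, seconds):
    """Estimate the raw picture payload of an uncompressed video stream, in bytes."""
    bytes_per_frame = Fraction(width * height * bits_per_pixel, 8)
    return int(bytes_per_frame * fps * seconds)


# Assumed parameters (not stated in the post): SD NTSC, 720x486,
# 30000/1001 fps, 10-bit 4:2:2 sampling = 20 bits per pixel on average.
one_hour = uncompressed_video_bytes(720, 486, Fraction(30000, 1001), 20, 3600)
print(f"{one_hour / 1e9:.1f} GB per hour")
```

Under those assumptions the estimate lands at about 94 GB per hour, in the same neighborhood as the 100 gigabyte figure once packing overhead and audio are added.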

To help offset this challenge, Duke Libraries has recently implemented the FFV1 video codec as its primary format for moving image preservation. FFV1 was first created as part of the open-source FFmpeg software project, and has been developed, updated and improved by various contributors in the Association of Moving Image Archivists (AMIA) community.

FFV1 enables lossless compression of moving-image content. Just like uncompressed video, FFV1 delivers the highest possible image resolution, color quality and sharpness, while avoiding the motion compensation and compression artifacts that can occur with “lossy” compression. Yet, FFV1 produces a file that is, on average, 1/3 the size of its uncompressed counterpart.

FFV1 produces a file that is, on average, 1/3 the size of its uncompressed counterpart. Yet, the audio & video content is identical, thanks to lossless compression.

The algorithms used in lossless compression are complex, but if you’ve ever prepared for a fall backpacking trip, and tightly rolled your fluffy goose-down sleeping bag into one of those nifty little stuff-sacks, essentially squeezing all the air out of it, you just employed (a simplified version of) lossless compression. After you set up your tent, and unpack your sleeping bag, it decompresses, and the sleeping bag is now physically identical to the way it was before you packed.

Yet, during the trek to the campsite, it took up a lot less room in your backpack, just like FFV1 files take up a lot less room in our digital repository. Like that sleeping bag, FFV1 lossless compression ensures that the decompressed video is mathematically identical to its pre-compressed state. No data is “lost” or irreversibly altered in the process.
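The round trip can be demonstrated in a few lines of Python. This is only an illustration of the lossless property, using the general-purpose zlib compressor from the standard library as a stand-in; it is not FFV1, and the test data is made up.

```python
import zlib

# Stand-in data: a repetitive 1 MB byte string (real video is far less tidy).
original = bytes(range(256)) * 4096

compressed = zlib.compress(original, 9)   # pack the sleeping bag
restored = zlib.decompress(compressed)    # unpack it at the campsite

print(len(original), "bytes ->", len(compressed), "bytes")
# Lossless means the restored bytes match the original exactly.
assert restored == original
```

The compressed copy is smaller, yet decompressing it returns every byte unchanged, which is the same guarantee a lossless video codec makes about the decoded frames.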

Duke Libraries’ Digital Production Center utilizes a pair of 6-foot-tall video racks, which currently house eight videotape decks covering a variety of obsolete formats such as U-matic (NTSC), U-matic (PAL), Betacam, DigiBeta, VHS (NTSC) and VHS (PAL, SECAM). Each deck’s output is converted from analog to digital (SDI) using Blackmagic Design Mini Converters.

The SDI signals are sent to a Blackmagic Design Smart Videohub, which is the central routing center for the entire system. Audio mixers and video transcoders allow the Digitization Specialist to tweak the analog signals so the waveform, vectorscope and decibel levels meet broadcast standards and the digitized video is faithful to its analog source. The output is then routed to one of two Retina 5K iMacs via Blackmagic UltraStudio devices, which convert the SDI signal to Thunderbolt 3.

FFV1 video digitization in progress in the Digital Production Center.

Because no major company (Apple, Microsoft, Adobe, Blackmagic, etc.) has yet adopted the FFV1 codec, multiple foundational layers of mostly open-source systems software had to be installed, tested and tweaked on our iMacs to make FFV1 work: Apple’s Xcode, Homebrew, AMIA’s vrecord, FFmpeg, Hex Fiend, AMIA’s ffmprovisr, GitHub Desktop, MediaInfo, and QCTools.

FFV1 encoding is run from the command line in a terminal, so some familiarity with command-line syntax and scripting is helpful for entering the correct commands and deciphering the terminal logs.

The FFV1 files are “wrapped” in the open source Matroska (.mkv) media container. Our FFV1 scripts employ several degrees of quality-control checks, input logs and checksums, which ensure file integrity. The files can then be viewed using VLC media player, for Mac and Windows. Finally, we make an H.264 (.mp4) access derivative from the FFV1 preservation master, which can be sent to patrons, or published via Duke’s Digital Collections Repository.
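To give a flavor of what these steps look like at the command line, here is a minimal Python sketch that only assembles FFmpeg commands for the two steps described above: an FFV1 preservation master in Matroska, and an H.264 access copy in MP4. The option choices follow commonly published FFmpeg recipes and are an assumption, not a transcript of our actual scripts; the file names are hypothetical.

```python
import shlex


def ffv1_master_command(src, dst):
    """Build an FFmpeg command that encodes video to FFV1 in a Matroska file."""
    return [
        "ffmpeg", "-i", src,
        "-map", "0", "-dn",      # keep all streams except data streams
        "-c:v", "ffv1",          # FFV1 video encoder
        "-level", "3",           # FFV1 version 3
        "-g", "1",               # every frame is self-contained
        "-slicecrc", "1",        # per-slice CRCs for error detection
        "-slices", "16",
        "-c:a", "copy",          # leave the audio untouched
        dst,
    ]


def h264_access_command(src, dst):
    """Build an FFmpeg command that makes an H.264/AAC access derivative."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        dst,
    ]


# Hypothetical file names, for illustration only.
print(shlex.join(ffv1_master_command("tape001.mov", "tape001.mkv")))
print(shlex.join(h264_access_command("tape001.mkv", "tape001.mp4")))
```

The sketch prints the commands rather than running them, so it can be read and adapted without FFmpeg installed.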

An added bonus is that, not only can Duke Libraries digitize analog videotapes and film reels in FFV1, we can also utilize the codec (via scripting) to target a large batch of uncompressed video files (that were digitized from analog sources years ago) and make much smaller FFV1 copies, that are mathematically lossless. The script runs checksums on both the original uncompressed video file, and its new FFV1 counterpart, and verifies the content inside each container is identical.
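One way such a comparison can work, sketched here under assumptions, is with FFmpeg's framemd5 output, which writes a checksum for every decoded frame. If the per-frame checksums from the uncompressed file and from its FFV1 copy match, the decoded pictures are identical. The helper names below are hypothetical, the sketch covers video only (`-an` drops audio), and it is not our production script.

```python
def framemd5_command(src, report):
    """Build an FFmpeg command that writes per-frame MD5s of the decoded video."""
    return ["ffmpeg", "-i", src, "-an", "-f", "framemd5", report]


def same_decoded_content(report_a, report_b):
    """Compare two framemd5 reports, ignoring '#' header lines.

    Header lines can differ between files (for example, software version),
    so only the per-frame checksum lines are compared.
    """
    def frame_lines(text):
        return [
            line.strip()
            for line in text.splitlines()
            if line.strip() and not line.startswith("#")
        ]

    return frame_lines(report_a) == frame_lines(report_b)
```

In use, one report would be generated for each file and the two report texts passed to `same_decoded_content`; only a `True` result would justify treating the FFV1 copy as the new master.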

Now, a digital collection of uncompressed masters that took up 9 terabytes can be deleted, and the newly generated batch of FFV1 files, which takes up only 3 terabytes, becomes the new set of preservation masters for that collection. No data has been lost, and the content is identical. Just like that goose-down sleeping bag, this helps the Duke University budget managers sleep better at night.

We’re hiring!

The Digital Production Center (DPC) is looking to hire a Digitization Specialist to join our team! The DPC team is on the forefront of enabling students, teachers, and researchers to continue their research by digitizing materials from our library collections.  We get to work with a variety of unique and rare materials (in a multitude of formats), and we use professional equipment to get the work done. Imagine working on digitizing papyri and comic books – the spectrum is far and wide! Get a glimpse of the collections that have been digitized by DPC staff by checking out our Duke Digital Collections.

Also, the people are really nice (and right now, we’re working in a socially distanced manner)!

More information about the job description can be found here. The successful candidate should be detail-oriented, possess excellent organizational and project management skills, have scanning experience, and be able to work both independently and effectively in a team environment. This position is part of the Digital Collections and Curation Services department and will report to the Digital Production Services manager.

More information about Duke’s benefit package can be found at https://hr.duke.edu/benefits. For more information and to apply, please submit an electronic resume, cover letter, and a list of 3 references to https://library.duke.edu/about/jobs/digitizationspecialist. Review of applications will begin immediately and will continue until the position is filled.

Access for One, Access for All: DPC’s Approach towards Folder Level Digitization

Earlier this year and prior to the pandemic, Digital Production Center (DPC) staff piloted an alternative approach to digitize patron requests with the Rubenstein Library’s Research Services (RLRS) team. The previous approach was focused on digitizing specific items that instruction librarians and patrons requested, and these items were delivered directly to that person. The alternative strategy, the Folder Level digitization approach, involves digitizing the contents of the entire folder that the item is contained in, ingesting these materials to the Duke Digital Repository (to enable Duke Library staff to retrieve these items), and when possible, publishing these materials so that they are available to anyone with internet access. This soft launch prepared us for what is now an all-hands-on-deck-but-in-a-socially-distant-manner digitization workflow.

Giao Luong Baker assessing folders in the DPC.

Since returning to campus for onsite digitization in late June, the DPC’s primary focus has been to perfect and ramp up this new workflow. It is important to note that the term “folder” in this case is more of a concept and that its contents and their conditions vary widely. Some folders may have 2 pages; other folders have over 300 pages. Some folders consist of pamphlets, notebooks, maps, papyri, and bound items. All this to say that a “folder” is a relatively loose term.

Like many initiatives at Duke Libraries, Folder Level Digitization is not just a DPC operation; it is a collaborative effort. This effort includes RLRS working with instructors and patrons to identify and retrieve the materials. RLRS also works with Rubenstein Library Technical Services (RLTS) to create starter digitization guides, which are the building blocks for our digitization guide. Lastly, RLRS vets the materials and determines their level of access. When necessary, Duke Libraries’ Conservation team steps in to prepare materials for digitization. After the materials are digitized, ingest and metadata work by the Digital Collections and Curation Services and RLTS teams ensures that the materials are preserved and available in our systems.

Kristin Phelps captures a color target.

Doing this work in the midst of a pandemic requires that DPC work closely with the Rubenstein Library Access Services Reproduction Team (a section of RLRS) to track our workflow using a Google Doc. We track materials from the point where they are identified by RLRS, through multiple quarantine periods, scanning, post processing, and file delivery, to ingest. Also, DPC staff are digitizing in a manner that is consistent with COVID-19 guidelines. Materials are quarantined before and after they arrive at the DPC, machines and workspaces are cleaned before and after use, capture is done in separate rooms, and quality control is done off site with specialized calibrated monitors.

Since we started Folder Level digitization, the DPC has received close to 200 unique Instruction and Patron requests from RLRS. As of the publication of this post, 207 individual folders (an individual request may contain several folders) have been digitized. In total, we’ve scanned and quality controlled over 26,000 images since we returned to campus!

We hope that digitizing entire folders will allow for increased access to the materials without risking damage through their physical handling. So far we anticipate that 80 new digital collections will be ingested into the Duke Digital Repository. This number will only grow as we receive more requests. Folder Level Digitization is an exciting approach towards digital collection development, as it is directly responsive to instruction and researcher needs. With this approach, it is access for one, access for all!

Hope Harvested

This began as a quest for images of people engaging in recreational activities. Facing copious time indoors with limited places to go, many are looking for respite. I thought it would be uplifting to find pictures of people having fun. While combing through Duke University Libraries’ numerous digital collections in search of such images, several photos caught my eye. I clicked through hundreds of images, reading their captions and summaries, driven to delve deeper into collections for the story behind those smiling faces. As I sought these stories, I recalled the words of James Baldwin:

You think your pain and your heartbreak are unprecedented in the history of the world, but then you read.

Here were the lived experiences of people striving, aspiring, and persevering.

What started as a search for people pursuing pastimes quickly pivoted. It transformed into a search for people – smiling, laughing and hoping despite their circumstances. Presented below is a small harvest of photographs that inspired this post, including embedded links to their collections. As they did for me, I hope these photos may serve as a gateway to explore these inspired collections.

This image is from a series of photographs taken by James Karales between 1953 and 1957 in Rendville, Ohio, a small mining town which was one of the first racially integrated towns in the U.S.

 

African would-be immigrants play soccer in an enclosed compound at the Safi detention centre outside Valletta July 15, 2008. Around 1,500 illegal immigrants are currently held in detention in Malta for periods of up to 18 months. Though their intention was to reach Italy, most found themselves in Malta when they were rescued by the Maltese Armed Forces when they found themselves in difficulties while on their way to reach European soil from Africa.

 

Men eating at cooperative farm, central Cuba

Casting a Critical Eye on the Hayti-Elizabeth Street Renewal Area Maps

In 2019, one of the digital collections we made available to the public was a small set of architectural maps and plans titled the ‘Hayti-Elizabeth Street Renewal Area’. The short description of the maps indicates they ‘depict existing and proposed structures and modifications to the Hayti neighborhood in Durham, NC.’ Sounds pretty benign, right? Perhaps even kind of hopeful, given the word ‘renewal’?

Hayti-Elizabeth Street Renewal Area, Existing Land Use Map

Nope. This anodyne description does not tell the story of the harm caused by the Durham Urban Renewal project of the 1960s and 1970s. The Durham Redevelopment Commission intended to eliminate ‘urban blight’ via this project, which ultimately resulted in the destruction of more than 4,000 households and 500 businesses in predominantly African American areas of the city. The Hayti District, once a flourishing and self-sufficient neighborhood filled with Black-owned businesses, was largely demolished, divided, and effectively severed from what is now downtown Durham by the construction of NC Highway 147. 

Bull City 150, a “public history, geography and community engagement project” based here at Duke University, hosts a suite of excellent multi-media public history exhibitions about housing inequality in Durham on its website. One of these is Dismantling Hayti, which focuses in particular on the effects of urban renewal on the neighborhood and the city.

Dismantling Hayti, Bull City 150

But this story of so-called urban renewal is not just about Durham – it’s about the United States as a whole. From the 1950s to the 1980s, municipalities across the country demolished roughly 7.5 million dwelling units, with a vastly disproportionate impact on Black and low-income neighborhoods, in the name of revitalization. Bulldozing for highway corridors was frequently a part of urban renewal projects, happening in San Francisco, Memphis, Boston, Atlanta, Syracuse, Baltimore, everywhere in the country – the list goes on and on. And it includes Saint Paul, Minnesota, the city where, mourning and protesting the killing of yet another Black person at the hands of a white police officer, thousands of people occupied Interstate 94 in recent weeks, marching from the state capitol to Minneapolis, over a highway that was once the African American neighborhood of Rondo.  

Urban renewal projects led to what social psychiatrist Dr. Mindy Fullilove refers to as root shock – “a traumatic stress reaction related to the destruction of one’s emotional ecosystem”. This is but one thread in the fabric of white supremacy out of which our country was woven, among other twentieth-century practices of redlining, discriminatory mortgage lending, denial of access to unemployment benefits, and rampant Jim Crow laws, which are still causing harm today. This is why it is important to interrogate the historical context of resources like the Hayti-Elizabeth Street Renewal Area maps – we should all accept the invitation extended on the Bull City 150 website to Durhamites to “reckon with the racial and economic injustices of the past 150 years and commit to building a more equitable future”.

ArcLight Migration: A Status Update After Three Months of Work

On January 20, 2020, we kicked off our first development sprint for implementing ArcLight at Duke as our new finding aids / collection guides platform. We thought our project charter was solid: thorough, well-vetted, with a reasonable set of goals. In the plan was a roadmap identifying a July 1, 2020 launch date and a list of nineteen high-level requirements. There was nary a hint of an impending global pandemic that could upend absolutely everything.

The work wasn’t supposed to look like this, carried out by zooming virtually into each other’s living rooms every day. Code sessions and meetings now require navigating around child supervision shifts and schooling-from-home responsibilities. Our new young office-mates occasionally dance into view or within earshot during our calls. Still, we acknowledge and are grateful for the privilege afforded by this profession to continue to do our work remotely from a safe distance.

So, a major shoutout is due to my colleagues in the trenches of this work overcoming the new unforeseen constraints around it, especially Noah Huffman, David Chandek-Stark, and Michael Daul. Our progress to date has only been possible through resilience, collaboration, and willingness to keep pushing ahead together.

Three months after we started the project, we remain on track for a summer 2020 launch.

As a reminder, we began with the core open-source ArcLight platform (demo available) and have been building extensions and modifications in our local application in order to accommodate Duke needs and preferences. With the caveat that there’ll be more changes coming over the next couple months before launch, I want to provide a summary of what we have been able to accomplish so far and some issues we have encountered along the way. Duke staff may access our demo app (IP-restricted) for an up-to-date look at our work in progress.

Homepage

Homepage design for Duke’s ArcLight finding aids site.
  • Duke Branding. Aimed to make an inviting front door to the finding aids consistent with other modern Duke interfaces, similar to–yet distinguished enough from–other resources like the catalog, digital collections, or Rubenstein Library website.
  • Featured Items. Built a configurable set of featured items from the collections (with captions), to be displayed randomly (actual selections still in progress).
  • Dynamic Content. Provided a live count of collections; we might add more indicators for types/counts of materials represented.

Layout

A collection homepage with a sidebar for context navigation.
  • Sidebar. Replaced the single-column tabbed layout with a sidebar + main content area.
  • Persistent Collection Info. Made collection & component views more consistent; kept collection links (Summary, Background, etc.) visible/available from component pages.
  • Width. Widened the largest breakpoint. We wanted to make full use of the screen real estate, especially to make room for potentially lengthy sidebar text.

Navigation

Component pages contextualized through a sidebar navigator and breadcrumb above the main title.
  • Hierarchical Navigation. Restyled & moved the hierarchical tree navigation into the sidebar. This worked well functionally in ArcLight core, but we felt it would be more effective as a navigational aid when presented beside rather than below the content.
  • Tooltips & Popovers. Provided some additional context on mouseovers for some navigational elements.

    Mouseover context in navigation.
  • List Child Components. Added a direct-child list in the main content for any series or other component. This makes for a clear navigable table of what’s in the current series / folder / etc. Paginating it helps with performance in cases where we might have 1,000+ sibling components to load.
  • Breadcrumb Refactor. Emphasized the collection title. Kept some indentation, but aimed for page alignment/legibility plus a balance of emphasis between current component title and collection title.

    Breadcrumb trail to show the current component’s nesting.
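
The pagination idea behind the child component list can be sketched in a few lines of Python. This is only an illustration, not the code used in the Duke ArcLight application: the `paginate` function, the page size of 50, and the sample folder names are all hypothetical, and a search-index-backed application would more likely ask the index for a single page of results than slice an in-memory list.

```python
from math import ceil


def paginate(items, page, per_page=50):
    """Return one page of items plus the total page count (pages are 1-indexed)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total_pages = max(1, ceil(len(items) / per_page))
    start = (page - 1) * per_page
    # Slicing past the end of the list simply yields a shorter (or empty) page.
    return items[start:start + per_page], total_pages


# Hypothetical series with 1,234 direct child components.
children = [f"Folder {n}" for n in range(1, 1235)]
page_items, total_pages = paginate(children, page=25, per_page=50)
```

Only the requested slice is rendered, so a series with more than a thousand children costs about the same to display as a small one.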

Search Results

Search results grouped by collection, with keyword highlighting.
  • “Group by Collection” as the default. Our stakeholders were confused by atomized components as search results outside of the context of their collections, so we tried to emphasize that context in the default search.
  • Revised search result display. Added keyword highlighting within result titles in Grouped or All view. Made Grouped results display checkboxes for bookmarking & digitized content indicators.
  • Advanced Search. Kept the global search box simple but added a modal Advanced search option that adds fielded search and some additional filters.
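
As a rough illustration of grouping results by collection and highlighting the keyword in titles, here is a minimal Python sketch. It is not how the application does it: the function names, the result dictionaries, and the `<em>` markup are assumptions made for the example, and in practice this kind of work is typically delegated to the search index.

```python
import html
import re


def highlight(title, keyword):
    """Wrap case-insensitive keyword matches in <em>, escaping everything else."""
    if not keyword:
        return html.escape(title)
    # With a capturing group, re.split keeps the matches at the odd indices.
    parts = re.split(f"({re.escape(keyword)})", title, flags=re.IGNORECASE)
    return "".join(
        f"<em>{html.escape(part)}</em>" if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )


def group_by_collection(results, keyword):
    """Group highlighted component titles under their parent collection."""
    groups = {}
    for result in results:
        groups.setdefault(result["collection"], []).append(
            highlight(result["title"], keyword)
        )
    return groups


# Hypothetical search results for the query "letters".
results = [
    {"collection": "Collection A", "title": "Letters, 1904"},
    {"collection": "Collection B", "title": "Family letters"},
    {"collection": "Collection A", "title": "Photographs"},
]
grouped = group_by_collection(results, "letters")
```

Each component stays attached to its collection, which is the context our stakeholders were missing when components appeared as standalone results.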

Digital Objects Integration

Digital objects from the Duke Digital Repository are presented inline in the finding aid component page.
  • DAO Roles. Indexed the @role attribute for <dao> elements; we used that to call templates for different kinds of digital content.
  • Embedded Object Viewers. Used the Duke Digital Repository’s embed feature, which renders <iframe>s for images and AV.
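
The role-to-template idea can be sketched as a simple lookup. The Python below is a hypothetical illustration only: the role values, template names, sample XML, and un-namespaced `role`/`href` attributes are assumptions for the example and are not taken from our EADs or from ArcLight itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a <dao> role to the template that renders it.
TEMPLATES = {
    "image-service": "embed_iframe",
    "audio-streaming": "embed_iframe",
    "web-resource": "plain_link",
}
DEFAULT_TEMPLATE = "plain_link"  # fallback for roles we do not recognize


def dao_templates(component_xml):
    """Return (href, template name) pairs for every <dao> in a component."""
    root = ET.fromstring(component_xml)
    pairs = []
    for dao in root.iter("dao"):
        role = dao.get("role", "")
        pairs.append((dao.get("href"), TEMPLATES.get(role, DEFAULT_TEMPLATE)))
    return pairs


# Made-up component with one recognized role and one unknown role.
sample = """<c>
  <did>
    <unittitle>Letter</unittitle>
    <dao role="image-service" href="https://example.org/item/1"/>
    <dao role="unknown-role" href="https://example.org/item/2"/>
  </did>
</c>"""
pairs = dao_templates(sample)
```

Falling back to a plain link for unrecognized roles keeps a digital object reachable even when no specialized viewer applies.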

Indexing

  • Whitespace compression. Added a step to the pipeline to remove extra whitespace before indexing. This seems to have slightly accelerated our time-to-index rather than slowing it down.
  • More text, fewer strings. We encountered cases where note-like fields indexed as strings by ArcLight core (e.g., <scopecontent>) needed to be converted to text because we had more than 32,766 bytes of data (limit for strings) to put in them. In those cases, finding aids were failing to index.
  • Underscores. For the IDs that end up in a URL for a component, we added an underscore between the finding aid slug and the component ID. We felt these URLs would look cleaner and be better for SEO (our slugs often contain names).
  • Dates. Changed the date normalization rules (some dates were being omitted from indexing/display).
  • Bibliographic ID. We succeeded in indexing our bibliographic IDs from our EADs to power a collection-level Request button that leads a user to our homegrown requests system.
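
Three of the indexing changes above (whitespace compression, the 32,766-byte string limit, and underscore-joined IDs) are simple enough to sketch in Python. This is an illustration under assumptions rather than our pipeline code: the function names and sample values are hypothetical, and the sketch assumes field values are measured as UTF-8 bytes.

```python
import re

MAX_STRING_BYTES = 32766  # the string-field limit mentioned above


def compress_whitespace(text):
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def exceeds_string_limit(value):
    """True if a value is too large for a string field (assumes UTF-8 bytes)."""
    return len(value.encode("utf-8")) > MAX_STRING_BYTES


def component_url_id(slug, component_id):
    """Join the finding aid slug and component ID with an underscore."""
    return f"{slug}_{component_id}"


# Hypothetical values, for illustration only.
note = compress_whitespace("  Letters\n\t and   notes ")
doc_id = component_url_id("examplefamily", "ref12")
too_long = exceeds_string_limit("a" * 32767)
```

Note that the limit applies to bytes rather than characters, so text with many accented or non-Latin characters reaches it sooner than a character count would suggest.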

Formatting

  • EAD -> HTML. We extended the EAD-to-HTML transformation rules for formatted elements to cover more cases (e.g., links like <extptr> & <extref> or other elements like <archref> & <indexentry>).

    Additional formatting and link render rules applied.
  • Formatting in Titles. We preserved bold or italic formatting in component titles.

ArcLight Core Contributions

  • We have been able to contribute some of our code back to the ArcLight core project to help out other adopters.

Setting the Stage

The behind-the-scenes foundational work deserves mention here, as it represents some of the most complex and challenging aspects of the project. It makes possible the application development driving the changes I’ve shared above.

  • Built separate code repositories for our Duke ArcLight application and our EAD data
  • Gathered a diverse set of 40 representative sample EADs for testing
  • Dockerized our Duke ArcLight app to simplify developer environment setup
  • Provisioned a development/demo server for sharing progress with stakeholders
  • Automated continuous integration and deployment to servers using GitLabCI
  • Performed targeted data cleanup
  • Successfully got all 4,000 of our finding aids indexed in Solr on our demo server

Our team has accomplished a lot in three months, in large part due to the solid foundation the ArcLight core software provides. We’re benefiting from some amazing work done by many, many developers who have contributed their expertise and their code to the Blacklight and ArcLight codebases over the years. It has been a real pleasure to be able to build upon an open source engine– a notable contrast to our previous practice of developing everything in-house for finding aids discovery and access.

Still, much remains to be addressed before we can launch this summer.

The Road Ahead

Here’s a list of big things we still plan to tackle by July (other minor revisions/bugfixes will continue as well)…

  • ASpace -> ArcLight. We need a smoother publication pipeline to regularly get data from ArchivesSpace indexed into ArcLight.
  • Access & Use Statements. We need to revise the existing inheritance rules and make sure these statements are presented clearly. It’s especially important when materials are indeed restricted.
  • Relevance Ranking. We know we need to improve the ranking algorithm to ensure the most relevant results for a query appear first.
  • Analytics. We’ll set up some anonymized tracking to help monitor usage patterns and guide future design decisions.
  • Sitemap/SEO. It remains important that Google and other crawlers index the finding aids so they are discoverable via the open web.
  • Accessibility Testing / Optimization. We aim to comply with WCAG 2.0 AA guidelines.
  • Single-Page View. Many of our stakeholders are accustomed to a single-page view of finding aids. There’s no such functionality baked into ArcLight, as its component-by-component views prioritize performance. We might end up providing a downloadable PDF document to meet this need.
  • More Data Cleanup. ArcLight’s feature set (especially around search/browse) reveals more places where we have suboptimal or inconsistent data lurking in our EADs.
  • More Community Contributions. We plan to submit more of our enhancements and bugfixes for consideration to be merged into the core ArcLight software.

If you’re a member of the Duke community, we encourage you to explore our demo and provide feedback. To our fellow future ArcLight adopters, we would love to hear how your implementations or plans are shaping up, and identify any ways we might work together toward common goals.

Stay safe, everyone!

Labor in the Time of Coronavirus

The Coronavirus pandemic has me thinking about labor–as a concept, a social process, a political constituency, and the driving force of our economy–in a way that I haven’t in my lifetime. It’s become alarmingly clear (as if it wasn’t before) that we all need food, supplies, and services to survive past next week, and that there are real human beings out there working to produce and deliver these things. No amount of entrepreneurship, innovation, or financial sleight of hand will help us through the coming months if people are not working to provide the basic requirements for life as we know it.

This blog post draws from images in our digitized library collections to pay tribute to all of the essential workers who are keeping us afloat during these challenging times. As I browsed these photographs and mused on our current situation, a few important and oft-overlooked questions came to mind.

Who grows our food? Where does it come from and how is it processed? How does it get to us?

Bell pepper pickers, 1984 June. Paul Kwilecki Photographs.
Prisoners at the county farm killing hogs, 1983 Mar. Paul Kwilecki Photographs.
Worker atop rail car loading corn from storage tanks, 1991 Sept. Paul Kwilecki Photographs.
Manuel Molina, mushroom farm worker, Kennett Square, PA 1981. Frank Espada Photographs.

What kind of physical environment do we work in and how does that affect us?

Maids in a room of the Stephen Decatur Hotel shortly before it was torn down, 1970. Paul Kwilecki Photographs.
Worker in pit preparing to weld. Southeastern Minerals Co. Bainbridge, 1991 Aug. Paul Kwilecki Photographs.
Loggers in the woods near Attapulgus, 1978 Feb. Paul Kwilecki Photographs.

How do we interact with machines and technology in our work? Can our labor be automated or performed remotely?

Worker signaling for more logs. Elberta Crate and Box Co. Bainbridge, 1991 Sept. Paul Kwilecki Photographs.
Worker, Williamson-Dickie plant. Bainbridge, 1991 Sept. Paul Kwilecki Photographs.
Machine operator watching computer controlled lathe, 1991 July. Paul Kwilecki Photographs.
Worker at her machine. Elberta Crate and Box Co. Bainbridge, 1981 Nov. Paul Kwilecki Photographs.

What equipment and clothing do we need to work safely and productively?

Cotton gin worker wearing safety glass and ear plugs for noise protection. Decatur Gin Co., 1991 Sept. Paul Kwilecki Photographs.
Worker operating bagging machine. Flint River Mills. Bainbridge, 1991 July. Paul Kwilecki Photographs.
Worker at State Dock, 1992 July. Paul Kwilecki Photographs.

Are we paid fairly for our work? How do relative wages for different types of work reflect what is valued in our society?

No known title. William Gedney Photographs.

How we think about and respond to these questions will inform how we navigate the aftermath of this ongoing crisis and whether or not we thrive into the future. As we celebrate International Workers’ Day on May 1 and beyond, I hope everyone will take some time to think about what labor means to them and to our society as a whole.