Earlier this year and prior to the pandemic, Digital Production Center (DPC) staff piloted an alternative approach to digitize patron requests with the Rubenstein Library’s Research Services (RLRS) team. The previous approach was focused on digitizing specific items that instruction librarians and patrons requested, and these items were delivered directly to that person. The alternative strategy, the Folder Level digitization approach, involves digitizing the contents of the entire folder that the item is contained in, ingesting these materials to the Duke Digital Repository (to enable Duke Library staff to retrieve these items), and when possible, publishing these materials so that they are available to anyone with internet access. This soft launch prepared us for what is now an all-hands-on-deck-but-in-a-socially-distant-manner digitization workflow.
Since returning to campus for onsite digitization in late June, the DPC’s primary focus has been to perfect and ramp up this new workflow. It is important to note that the term “folder” in this case is more of a concept and that its contents and their conditions vary widely. Some folders have 2 pages; others have over 300. Some folders consist of pamphlets, notebooks, maps, papyri, and bound items. All this to say that “folder” is a relatively loose term.
Like many initiatives at Duke Libraries, Folder Level Digitization is not just a DPC operation; it is a collaborative effort. This effort includes RLRS working with instructors and patrons to identify and retrieve the materials. RLRS also works with Rubenstein Library Technical Services (RLTS) to create starter digitization guides, which are the building blocks for our digitization guide. Lastly, RLRS vets the materials and determines their level of access. When necessary, Duke Library’s Conservation team steps in to prepare materials for digitization. After the materials are digitized, ingest and metadata work by the Digital Collections and Curation Services and RLTS teams ensures that the materials are preserved and available in our systems.
Doing this work in the midst of a pandemic requires that DPC work closely with the Rubenstein Library Access Services Reproduction Team (a section of RLRS) to track our workflow using a Google Doc. We track the point where the materials are identified by RLRS, through multiple quarantine periods, scanning, post processing, file delivery, to ingest. Also, DPC staff are digitizing in a manner that is consistent with COVID-19 guidelines. Materials are quarantined before and after they arrive at the DPC, machines and workspaces are cleaned before and after use, capture is done in separate rooms, and quality control is done off site with specialized calibrated monitors.
Since we started Folder Level digitization, the DPC has received close to 200 unique Instruction and Patron requests from RLRS. As of the publication of this post, 207 individual folders (an individual request may contain several folders) have been digitized. In total, we’ve scanned and quality controlled over 26,000 images since we returned to campus!
We hope that digitizing entire folders will increase access to the materials without risking damage through physical handling. So far we anticipate that 80 new digital collections will be ingested to the Duke Digital Repository. This number will only grow as we receive more requests. Folder Level Digitization is an exciting approach towards digital collection development, as it is directly responsive to instruction and researcher needs. With this approach, it is access for one, access for all!
The Coronavirus pandemic has me thinking about labor–as a concept, a social process, a political constituency, and the driving force of our economy–in a way that I haven’t in my lifetime. It’s become alarmingly clear (as if it wasn’t before) that we all need food, supplies, and services to survive past next week, and that there are real human beings out there working to produce and deliver these things. No amount of entrepreneurship, innovation, or financial sleight of hand will help us through the coming months if people are not working to provide the basic requirements for life as we know it.
This blog post draws from images in our digitized library collections to pay tribute to all of the essential workers who are keeping us afloat during these challenging times. As I browsed these photographs and mused on our current situation, a few important and oft-overlooked questions came to mind.
Who grows our food? Where does it come from and how is it processed? How does it get to us?
What kind of physical environment do we work in and how does that affect us?
How do we interact with machines and technology in our work? Can our labor be automated or performed remotely?
What equipment and clothing do we need to work safely and productively?
Are we paid fairly for our work? How do relative wages for different types of work reflect what is valued in our society?
How we think about and respond to these questions will inform how we navigate the aftermath of this ongoing crisis and whether or not we thrive into the future. As we celebrate International Workers’ Day on May 1 and beyond, I hope everyone will take some time to think about what labor means to them and to our society as a whole.
‘Tis the time of year for top 10 lists. Here at Duke Digital Collections HQ, we cannot just pick 10, because all our digital collections are tops! What follows is a list of all the digital collections we have launched for public access this calendar year.
Our newest collections include a range of formats and subject areas, from 19th-century manuscripts to African American soldiers’ photograph albums to Duke Men’s Basketball posters to our first multispectral images of papyrus to be ingested into the repository. We also added new content to 4 existing digital collections. Lastly, our platform migration is still ongoing, but we made some incredible progress this year as you will see below. Our goal is to finish the migration by the end of 2020.
New Digital Collections
African American Soldiers Photo Albums (browse all 8 or 1 by 1 using the links below):
In 2020, we’ll be making significant changes to our systems supporting archival discovery and access. The main impetus for this shift is that our current platform has grown outdated and is no longer sustainable going forward. We intend to replace our platform with ArcLight, open source software backed by a community of peer institutions.
Finding Aids at Duke: Innovations Past
At Duke, we’re no strangers to pushing the boundaries of archival discovery through advances in technology. Way back in the mid 1990s, Duke was among the pioneers rendering SGML-encoded finding aids into HTML. For most of the 90s and aughts we used a commercial platform, but we decided to develop our own homegrown finding aids front-end in 2007 (using the Apache Cocoon framework). We then replaced it in 2012 with another in-house platform built on the Django web framework.
Since going home-grown in 2007, we have been able to find some key opportunities to innovate within our platforms. Here are a few examples:
Our current platform was pretty good for its time, but a lot has changed in eight years. The way we build web applications today is much different than it used to be. And beyond desiring a modern toolset, there are major concerns going forward around size, search/indexing, and support.
We have some enormous finding aids. And we have added more big ones over the years. This causes problems of scale, particularly with an interface like ours that renders each collection as a single web page with all of the text of its contents written in the markup. One of our finding aids contains over 21,000 components; all told it is 9MB of raw EAD transformed into 15MB of HTML.
No amount of caching or server wizardry can change the fact that this is simply too much data to be delivered and rendered in a single webpage, especially for researchers in lower-bandwidth conditions. We need a solution that divides the data for any given finding aid into smaller payloads.
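To make the idea of smaller payloads concrete, here is a minimal sketch in Python. It is not how our platform or ArcLight actually works; the `paginate` function, the page size of 500, and the placeholder component names are all assumptions made for illustration. It only shows the general principle of serving a large list of components in slices rather than all at once.

```python
from typing import Iterator, Sequence


def paginate(components: Sequence[str], page_size: int) -> Iterator[Sequence[str]]:
    """Yield successive slices of at most page_size components."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(components), page_size):
        yield components[start:start + page_size]


# Hypothetical stand-in for a finding aid with 21,000 components.
components = [f"component-{i}" for i in range(21_000)]

# Each page is a payload a browser could request on its own.
pages = list(paginate(components, page_size=500))

print(len(pages))     # 42 payloads instead of one enormous page
print(len(pages[0]))  # 500 components in the first payload
```

With a scheme along these lines, a researcher on a slow connection would only download the slice of the finding aid they are looking at, rather than all 15MB of HTML.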
Google Custom Search does a pretty nice job of relevance ranking and highlighting where in a finding aid a term matches (after all, that’s Google’s bread-and-butter). However, when used to power search in an application like this, it has some serious limitations. It only returns a maximum of one hundred results per query. Google doesn’t index 100% of the text, especially for our larger finding aids. And some finding aids are just mysteriously omitted despite our best efforts optimizing our markup for SEO and providing a sitemap.
We need search functionality where we have complete control of what gets indexed, when, and how. And we need assurance that the entirety of the materials described will be discoverable.
This is a familiar story. Homegrown applications used for several years by organizations with a small number of developers and a large number of projects to support become difficult to sustain over time. We have only one developer remaining who can fix our finding aids platform when it breaks, or prevent it from breaking when the systems around it change. Many of the software components powering the system are at or nearing end-of-life and they can’t be easily upgraded.
Where to Go From Here?
It has been clear for a while that we would soon need a new platform for finding aids, but not as clear what platform we should pursue. We had been eyeing the progress of two promising open source community-built solutions emerging from our peer institutions: the ArchivesSpace Public UI (PUI), and ArcLight.
Over 2018-19, my colleague Noah Huffman and I co-led a project to install pilot instances of the ASpace PUI and ArcLight, index all of our finding aids in them, and then evaluate the platforms for their suitability to meet Duke’s needs going forward. The project involved gathering feedback from Duke archivists, curators, research services staff, and our digital collections implementation team. We looked at six criteria: 1) features; 2) ease of migration/customization; 3) integration with other systems; 4) data cleanup considerations; 5) impact on existing workflows; 6) sustainability/maintenance.
There’s a lot to like about both the ASpace PUI and ArcLight. Feature-wise, they’re fairly comparable. Both are backed by a community of talented, respected peers, and either would be a suitable foundation for a usable, accessible interface to archives. In the end, we recommended that Duke pursue ArcLight, in large part due to its similarity to so much of the other software in our IT portfolio.
Duke is certainly not alone in wanting to replace an outdated, unsustainable homegrown finding aids platform, or in intending to use ArcLight as the replacement.
This fall, with tremendous leadership from Stanford University Libraries, five universities collaborated on developing the ArcLight software further to address shared needs. Over a nine-week work cycle from August to October, we had the good fortune of working alongside Stanford, Princeton, Michigan, and Indiana. The team addressed needs on several fronts, especially: usability, accessibility, indexing, context/navigation, and integrations.
Three Duke staff members participated: I was a member of the Development Team, Noah Huffman a member of the Product Owners Team, and Will Sexton on the Steering Group.
The work cycle is complete and you can try out the current state of the core ArcLight demo application. It includes several finding aids from each of the participating partner institutions. Here are just a few highlights that have us excited about bringing ArcLight to Duke:
Here’s a final demo video (37 min) that nicely summarizes the work completed in the fall 2019 work cycle.
Lighting the Way
With some serious momentum from the fall ArcLight work cycle and plans taking shape to implement the software in 2020, the Duke Libraries intend to participate in the Stanford-led, IMLS grant-funded Lighting the Way project, a platform-agnostic National Forum on Archival Discovery and Delivery. Per the project website:
Lighting the Way is a year-long project led by Stanford University Libraries running from September 2019-August 2020 focused on convening a series of meetings focused on improving discovery and delivery for archives and special collections.
Coming in 2020: ArcLight Implementation at Duke
There’ll be much more to share about this in the new year, but we are gearing up now for a 2020 ArcLight launch at Duke. As good as the platform is now out-of-the-box, we’ll have to do additional development to address some local needs, including:
An efficient preview/publication workflow
Digital object viewing / repository integration
Some data cleanup
Building these local customizations will be time well spent. We’ll also look for more opportunities to collaborate with peers and contribute code back to the community. The future looks bright for Duke with ArcLight lighting the way.
The ongoing tensions between academic institutions and publishers have been escalating the last few months, but those tensions have existed for many years. The term “Big Deal” has been coined to describe a long-standing, industry-wide practice of journal bundling that forces libraries to subscribe to unwanted and unneeded publications rather than paying more for a limited number of individual subscriptions. This is a practice you see in other industries – for example, cable packages that provide hundreds of channels, even if you only want one or two specific channels.
What is especially problematic in higher education is that academics produce and review the content that gets published in the journals (for free), and then the universities have to pay the publishers a subscription fee to access the content. Imagine if YouTube required a subscription fee to watch any videos, including the ones you had posted. It’s a system that makes research harder to access and inhibits global scientific progress, all so publishers can earn an enormous profit margin.
Right now, academic publishing is controlled by five publishers (the “Big Five”) – an oligopoly that makes it very difficult for libraries to negotiate better deals. Only very large organizations or consortia, like the University of California, have been able to start pushing back against the system. It will likely take large shake-ups like this for any large changes to take hold, but in the meantime there may be ways to situate ourselves for making better purchasing decisions.
At Duke, we often review our usage of specific journal titles as we prepare to make purchasing decisions. Usage data comes in a variety of forms, but the most popular are counts of Duke views and downloads that come directly from the publishers and the number of times Duke authors publish in or cite a particular journal. There are many other kinds of data that might be of interest, however, including Duke participation on editorial boards, usage differences across disciplines, and even whether or not the journal is fully open access. Blending various data sources and optimizing the search decisions for a given budget cycle can be overwhelming.
Last fall, Duke University Libraries decided to propose a project for Duke’s Data+ summer program – a summer research experience in data science for undergraduate students. Our project, “Breaking the Bundle: Analyzing Duke’s Journal Subscriptions”, focuses on Duke’s subscriptions to journals published by Elsevier. The program is in its third week, and our team of two incredibly sharp undergraduates has been hard at work building and blending our datasets. Our goal by the end of summer is to have a proof-of-concept dashboard that lets collection managers adjust the weights of various usage measures to generate an ideal collection of journals for a particular budget.
It is still very early in the process, but the students have been hard at work and have made great progress. We decided it would be best to develop the analysis software and dashboard using R, a statistical computing project with a rich history and many helpful development tools. In addition to publisher-provided views and downloads, the students have been able to use websites and APIs to collect data on journal open access status, editorial boards, numbers of publications, and numbers of citations. All Data+ teams present publicly on the projects twice during the summer, and we hope to schedule a third talk for a library audience before the end of the program on August 2.
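For readers curious what “adjusting the weights of usage measures” might look like in practice, here is a rough sketch. It is written in Python purely for illustration (the team is working in R), and it is not the students’ code or method. The journal titles, prices, measure names, weights, and the greedy score-per-dollar heuristic are all assumptions; a real tool might use a proper optimization routine instead.

```python
from dataclasses import dataclass


@dataclass
class Journal:
    title: str
    price: float    # annual subscription cost (hypothetical figure, must be > 0)
    measures: dict  # usage measures, each assumed to be scaled to the 0-1 range


def score(journal, weights):
    """Weighted sum of a journal's usage measures."""
    return sum(weights.get(name, 0.0) * value
               for name, value in journal.measures.items())


def select(journals, weights, budget):
    """Greedy heuristic: take journals by score per dollar while they fit the budget."""
    ranked = sorted(journals,
                    key=lambda j: score(j, weights) / j.price,
                    reverse=True)
    chosen, spent = [], 0.0
    for journal in ranked:
        if spent + journal.price <= budget:
            chosen.append(journal)
            spent += journal.price
    return chosen, spent


# Entirely made-up data for three journals.
journals = [
    Journal("Journal A", 4000.0, {"downloads": 0.9, "citations": 0.8, "duke_articles": 0.7}),
    Journal("Journal B", 1500.0, {"downloads": 0.4, "citations": 0.2, "duke_articles": 0.1}),
    Journal("Journal C", 2500.0, {"downloads": 0.6, "citations": 0.9, "duke_articles": 0.5}),
]

# A collection manager would adjust these weights in the dashboard.
weights = {"downloads": 0.5, "citations": 0.3, "duke_articles": 0.2}

chosen, spent = select(journals, weights, budget=6500.0)
print([j.title for j in chosen], spent)
```

Changing the weights changes the ranking, which is the point: the same budget can yield a different collection depending on which usage measures a collection manager decides matter most.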
We look forward to seeing what the summer will bring! While this project is just one small step, automating the collection and analysis of journal usage will position us well, both for responsible purchases and for a hopefully-changing publishing landscape.
Looking for something to keep you company on your summer vacation? Why not direct your devices to Duke Digital Collections! Seriously! Here are a few of the compelling collections we debuted earlier this spring, and we have more coming in late June.
These maps and two-volume report document Durham’s Hayti-Elizabeth Street neighborhood infrastructure prior to the construction of the Durham Freeway, as well as the justifications for the redevelopment of the area. This is an excellent resource for folks studying Durham history and/or the urban renewal initiatives of the mid-20th century.
We launched 8 collections of photograph albums created by African American soldiers serving in the military across the world including Japan, Vietnam and Iowa. Together these albums help “document the complexity of the African American military experience” (Bennett Carpenter from his blog post, “War in Black and White: African American Soldiers’ Photograph Albums”).
This photograph album contains pictures taken by Sir Percy Molesworth Sykes during his travels in a mountainous region of Central Asia, now the Xinjiang Uyghur Autonomous Region of China, with his sister, Ella Sykes. According to the collection guide, the album’s “images are large, crisp, and rich with detail, offering views of a remote area and its culture during tensions in the decades following the Russo-Turkish War”.
Our work never stops, and we have several large projects in the works that are scheduled to launch by the end of June. The first is the initial batch of video recordings from the Memory Project. We are also busy migrating the incredible photographs from the Sidney Gamble collection into the digital repository. Finally, there is one last batch of Radio Haiti recordings on the way.
Keeping in touch
We launch new digital collections just about every quarter, and have been investigating new ways to promote our collections as part of an assessment project. We are thinking of starting a newsletter – would you subscribe? What other ways would you like to keep in touch with Duke Digital Collections? Post a comment or contact me directly.
Hello! This is my first blog as the new Digital Production Service Manager, and I’d like to take this opportunity to take you, the reader, through my journey of discovering the treasures that the Duke Digital Collections program offers. To personalize this task, I explored the materials related to my family’s journey to the United States. First, I should contextualize. After migrating from south China in the mid-1800s, my family fled Vietnam in the late 1970s and we left with the bare necessities – mainly food, clothes, and essential documents. All I have now are a few family pictures from that era and vividly told stories from my parents to help me connect the dots of my family’s history.
When I started delving into Duke’s Digital Collections, it was heartening to find materials of China, Vietnam, and even anti-war materials in the U.S. The following are some materials and collections that I’d like to highlight.
The Sidney D. Gamble Photographs offer over 5,000 photographs of China in the early 20th century. Images of everyday life in China and landscapes are available in this collection. The above image from the Gamble collection is that of a junk, or houseboat, photographed in the early 1900s. When my family fled Vietnam, fifty people crammed into a similar vessel and sailed in the dead of night along the Gulf of Tonkin. My parents spoke of how they were guided by the moonlight and how fearful they were of the junk catching fire from cooking rice.
The African American Soldier’s Vietnam War photograph album collection offers these gorgeous images of Vietnam. This is the country that was home for multiple generations for my family, and up until the war, it was a good life. I am astounded and grateful that these postcards were collected by an American soldier in the middle of war. Considering that I grew up in Los Angeles, California, I have no sense of the world that my parents inhabited, and these images help me appreciate their stories even more. On the other side of the planet, there were efforts to stop the war and it was intriguing to see a variety of digital collections depicting these perspectives through art and documentary photography. The image below is that of a poster from the Italian Cultural Posters collection depicting Uncle Sam and the Viet Cong.
In addition to capturing street scenes in London, the Ronald Reis Collection includes images of Vietnam during the war and the anti-war effort in the United States. The image below is that of a demonstration in Bryant Park in New York City. I recognize that the conflict was fought on multiple fronts and am grateful for these demonstrations, as they ultimately led to the end of the war. Lastly, the James Karales Photos collection depicts Vietnam during the war. The image below, titled “Soldiers leaving on helicopter” is one that reminds me of my uncle who left with the American soldiers and started a new life in the United States. In 1980, thanks to the Family Reunification Act, the aid of the American Red Cross, and my uncle’s sponsorship, we started a new chapter in America.
Perhaps this is typical of the immigrant experience, but it still is important to put into words. Not every community has the resources and the privilege to be remembered, and where there are materials to help piece those stories together, they are absolutely valued and appreciated. Thank you, Duke University Libraries, for making these materials available.
In anticipation of next Tuesday’s midterm elections, here is a photo gallery of voting-related images from Duke Digital Collections. Click on a photo to view more images from our collections dealing with political movements, voting rights, propaganda, activism, and more!
If you haven’t already taken advantage of early voting, we at Bitstreams encourage you to exercise your right on November 6!
When Duke professor and botanist Henry J. Oosting agreed to take part in an expedition to Greenland in the summer of 1937, his mission was to collect botanical samples and document the region’s native flora. The expedition, organized and led by noted polar explorer Louise Arner Boyd, included several other accomplished scientists of the day, and its principal achievement was the discovery and charting of a submarine ridge off of Greenland’s eastern coast.
In a diary he kept during his trip titled “To Greenland in 105 Days, or Why did I ever leave home,” Oosting focuses little on the expedition’s scientific exploits. Instead, he offers a more intimate look into the mundane and, at times, amusing aspects of early polar exploration. Supplementing the diary in the recently published Henry J. Oosting papers digital collection are a handful of digitized nitrate negatives that add visual interest to his arctic (mis)adventures.
Oosting’s journey got off to an inauspicious start when he wrote in his opening entry on June 9, 1937: “Frankly, I’m not particularly anxious to go now that the time has come–adventure of any sort has never been my line–and the thought of the rolling sea gives me no great cheer.” What follows over the next 200 pages or so, by his own account, are the “inane mental ramblings of a simple-minded botanist,” complete with dozens of equally inane marginal doodles.
The Veslekari, the ship chartered by Louise Boyd for the expedition, first encountered sea ice on July 12 just off the east coast of Greenland. As the ship slowed to a crawl and boredom set in among the crew the following day, Oosting wrote in his diary that “Miss Boyd’s story of the polar bear is worth recording.” He then relayed a joke Boyd told the crew: “If you keep a private school and I keep a private school then why does a polar bear sit on a cake of ice…? To keep its privates cool, of course.” For clarification, Oosting added: “She says she has been trying for a long time to get just the right picture to illustrate the story but it’s either the wrong kind of bear or it won’t hold its position.”
When the expedition finally reached the Greenland coast at the end of July, Oosting spent several days exploring the Tyrolerfjord glacier, gathering plant specimens and drying them on racks in the ship’s engine room. On the glacier, Oosting observed an arctic hare and an ermine, and noted that “my plants are accumulating in such quantity.”
As the expedition wore on Oosting grew increasingly frustrated with the daily tedium and with Boyd’s unfailing enthusiasm for the enterprise. “In spite of everything…we are stopping at more or less regular intervals to see what B thinks is interesting,” Oosting wrote on August 19. “I didn’t go ashore this A.M. for a 15 min. stop even after she suggested it–have heard about it 10 times since…I’ll be obliged to go in every time now regardless or there will be no living with this woman. I am thankful, sincerely thankful, there are only 5 more days before we sail for I am thoroughly fed-up with this whole business.”
By late August, the Veslekari and crew headed back east towards Bergen, Norway and eventually Newcastle, England, where Oosting boarded a train for London on September 12. “This sleeping car is the silliest arrangement imaginable,” Oosting wrote, “my opinion of the English has gone down–at least my opinion of their ideas of comfort.” After a brief stint sightseeing around London, Oosting boarded another ship in Southampton headed for New York and eventually home to Durham. “It will be heaven to get back to the peace and quiet of Durham,” Oosting pined on September 14, “I’m developing a soft spot for the lousy old town.”
Oosting arrived home on September 21, where his diary ends. Despite his curmudgeonly tone throughout and his obsession with recording every inconvenience and impediment encountered along the way, it’s clear from other sources that Oosting’s work on the voyage made important contributions to our understanding of arctic plant life.
In The Coast of Northeast Greenland (1948), edited by Louise Boyd and published by the American Geographical Society, Oosting authored a chapter titled “Ecological Notes on the Flora,” in which he meticulously documented the specimens he collected in the arctic. The onset of World War II and concerns over national security delayed publication of Oosting’s findings, but when released, they provided valuable new information about plant communities in the region. While Oosting’s diary reveals a man with little appetite for adventure, his work endures. As the foreword to Boyd’s 1948 volume attests: “When travelers can include significant contributions to science, then adventure becomes a notable achievement.”