All posts by Meredith Graham

Data Lost, but not Forgotten

Intern Experience: Kaylee Alexander (Data & Visualization Services)

This post by Kaylee Alexander, 2019 Humanities Unbounded Graduate Assistant, is part of a series on graduate students’ “Intern Experience” at Duke University Libraries. 

With the growing popularity of digital humanities projects, the question of how humanists should manage data, and specifically missing data and data limitations, is of increasing importance. Often the glittering possibilities of integrating technology and data-driven research methods into historical analysis make us forget that we are still dealing with imperfect information, albeit processed in new and meaningful ways. In my own research on 19th-century funerary monuments in Paris, the issue of survival bias has been pervasive, as very few tombs (only the most expensive) have survived into the present day.

Survival bias occurs when we focus on people or things that have passed through a selection process and overlook those that haven’t. In 1943, for example, damaged bomber planes returning from combat were being studied to identify areas that needed additional reinforcement. However, these planes had survived. What about those that didn’t? Where had they sustained damage? This was the question posed by statistician Abraham Wald, who argued that damage to returned planes represented not where improvements were needed, but rather where planes could sustain damage and still return safely. It was the undamaged areas that were more telling.

Diagram showing areas of damage to returned WWII bomber planes (red) and recommended areas for reinforcement based on Wald’s analysis.

Historical studies are, not surprisingly, prone to such survival biases. Objects and documents get lost or damaged; others are not deemed worthy of being kept. Some information is just never recorded. But, just like Wald’s returned bomber planes, what does survive can be used to consider what we’ve lost. This is a concept that I work with all of the time, and a bias that my work specifically tries to overcome through data-driven practices. However, it is not something that I had yet considered in the context of inherited datasets.

As a Humanities Unbounded Graduate Assistant with Duke Libraries’ Data and Visualization Services, I began working with members of the Representing Migration Humanities Lab in preparation for their Data+ project, “Remembering the Middle Passage.” The project is led by English professor Charlotte Sussman, and one of its original goals was to use data representing nearly 36,000 transatlantic slave voyages to see if it would be possible to map a reasonable location for a deep-sea memorial to the transatlantic slave trade. Data on these voyages had been compiled and made openly accessible online by a team of researchers working with the Emory Center for Digital Scholarship (among others). The promises of these data were great; we just had to figure out how to use them.

My primary task was getting to know the data and providing support in preparation for the upcoming Data+ session. So, I began with the Slave Voyages website.

Home page for the Slave Voyages website: https://www.slavevoyages.org/

The landing page for the database boasts that “this digital memorial raises questions about the largest slave trades in history and offers access to the documentation available to answer them.” Here, you can view and download data on these voyages as well as access summary tables and interactive data visualizations, timelines, and maps, allowing users to easily interact with a wealth of information. Clearly labeled columns, filled with rows of data, project an image of endless research possibilities with all the data you could ever need.

Web-based interface to voyages data in the Trans-Atlantic Slave Trade Database.

However, the online interactive database only represents about half of the variables included in the full version of the database, which can be downloaded, but certainly isn’t as user-friendly as the front-facing version. One of the most glaring things I noticed when I first opened this file was all of the empty cells.

Excel sheet showing the full version of the Trans-Atlantic Slave Trade Database downloaded from https://slavevoyages.org/voyage/downloads.

It soon became clear that the online version only included a selection of the most complete variables (many of which were estimates based on original sources).

One of the first things I do when working with a new dataset in my own work is to create an overview of all of my variables and the percentage of records that have a value for each variable. This provides me with useful insights into how complete my data are, and also how reliable certain variables will be for the types of questions I want to ask. I find this to be particularly useful when working with data that I have not compiled myself, even when a codebook already exists, as it helps you quickly become familiar with exactly what you have and what might be possible. More often than not, I end up revising my research questions as a result of this process. So, I wondered how this might help the Data+ team set their goals.
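A completeness overview of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow actually used for the project: the sample rows and the column names (`voyage_id`, `year`, `embarked`, `disembarked`, `deaths`) are hypothetical stand-ins, not the database's real variable names or values.

```python
import csv
import io

# Hypothetical sample data: column names and values are illustrative only,
# not taken from the Trans-Atlantic Slave Trade Database.
SAMPLE_CSV = """voyage_id,year,embarked,disembarked,deaths
1,1750,300,270,30
2,1751,,,
3,,410,,
4,1760,250,240,
"""


def completeness(rows):
    """Return {column: percentage of rows with a non-empty value}."""
    rows = list(rows)
    if not rows:
        return {}
    total = len(rows)
    report = {}
    for column in rows[0].keys():
        # Treat missing fields (None) and blank strings as empty cells.
        filled = sum(1 for row in rows if (row[column] or "").strip())
        report[column] = round(100 * filled / total, 1)
    return report


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = completeness(rows)

# List the least complete variables first, since they constrain the
# research questions the most.
for column, pct in sorted(report.items(), key=lambda item: item[1]):
    print(f"{column:12s} {pct:5.1f}%")
```

To run the same check on a downloaded file, the in-memory sample could be swapped for an opened CSV file passed to `csv.DictReader`. Sorting from least to most complete puts the variables that will limit the analysis at the top of the list.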

While the original questions of the project had been formed around mortality and how to map the experiences of enslaved people on board these voyages, a reconsideration of the data showed that answers to these questions would be attainable for only a fraction of the voyages in the database, and would tell us nothing about voyages that were never recorded at all.

The question of all this missing data then became an essential part of the research project. How could all these gaps inform us about what isn’t there? Why were data missing, and how could we use this to think more broadly about erasure in the context of the slave trade? If our goal was to memorialize lives lost, how could we best and most appropriately accomplish this given the data we didn’t have?

There is still much work to be done before we can even begin answering these questions, and I leave that in the capable hands of the Data+ team and the Representing Migration Lab. But until then, my takeaway is this: missing data should not become forgotten data. Knowing what we’re working with, whether it be inherited data or data we’ve constructed, and being aware of the data we’re missing, allows us to reformulate our research objectives in new and more meaningful ways.

Kaylee P. Alexander is a Ph.D. Candidate in the Department of Art, Art History & Visual Studies, where she is also a research assistant with the Duke Art, Law & Markets Initiative (DALMI). Her dissertation research focuses on the visual culture of the cemetery and the market for funerary monuments in nineteenth-century Paris. In the spring of 2019, she served as a Humanities Unbounded graduate assistant with Data and Visualization Services at Duke University Libraries. Follow her on Twitter @kpalex91

Digital Project Profiles: Project Vox

The Digital Project Profiles series features projects that have partnered or worked closely with Duke Libraries’ Digital Scholarship Services (DSS) department. These projects illustrate the kinds of research, pedagogical, and publishing questions that DSS addresses. For assistance with your own project, contact askdigital@duke.edu.

The Project

Project Vox, http://projectvox.library.duke.edu
2014 – ongoing
An open educational resource, created and run primarily by students and volunteers, that provides resources for incorporating early modern women philosophers into research and instruction.

In late spring 2014, a project team formed at Duke University to build a website that could support research and teaching about non-canonical women philosophers, and they launched the Project Vox website in March 2015. From the start, the team has included undergraduates, graduate students, faculty, librarians, and technical staff. The Project Vox website serves as the virtual hub for an international network of scholars to work together in expanding research and teaching beyond the traditional philosophical “canon” and beyond traditional narratives of modern philosophy’s history.

What motivated Project Vox?

Philosophy is a surprisingly static and homogeneous discipline. For the past fifty years, the humanities have been dominated by women; yet in philosophy, women make up only one-third of advanced degree recipients. In terms of gender diversity, philosophy aligns more closely with (and in some cases falls below) the historically male-dominated sciences. What could be the reason for this gender gap? Recent studies suggest that undergraduate philosophy courses set the stage for this divide, particularly when the majority of figures studied are male. While the latter half of the 20th century saw many humanities disciplines expand and reform their canons to include marginalized voices, the discipline of philosophy saw little change. The figures and texts that dominated philosophy in the early 20th century persisted into the 21st, despite research demonstrating that a number of early modern women actively engaged in and influenced philosophical discussions with their more famous peers. Ultimately, Project Vox seeks to change the field and the face of philosophy by providing information and materials necessary to incorporate women philosophers into undergraduate instruction.

What makes Project Vox an important case study for digital scholarship?

In addition to its intellectual aims, Project Vox is an important case study because of its successful open-access publication model and the workflows that support it.

Project Vox is an open-access publishing project, run by a predominantly student team, that provides participants with hands-on experience and education in digital publishing while also presenting users worldwide with resources for changing their philosophy research and instruction. As a project with long-term and lofty objectives, Project Vox pursues incremental impact toward its goals while minimizing the costs for sustaining that effort. To do that, the project has placed particular emphasis on outreach and assessment, systematically engaging with the project’s audiences to solicit feedback and encourage participation in the project. To increase the pace of publication and distribute work, the team has developed a collaborative approach to research; Project Vox is now in the process of codifying this workflow for sharing with international partners.

What has working on Project Vox taught us?

For staff in the Digital Scholarship Services department, Project Vox has provided a wealth of insights into digital publishing and increased our own capacity for advising others wishing to pursue their own publishing projects (a short list of topics is provided below, along with places where you can find more information):

  • Making an open educational resource (OER) discoverable and citable (e.g., using WordPress plugins to display Dublin Core metadata for use in citation management software such as Zotero and Open Graph metadata for sharing on social media; creating a MARC record for the website, making it discoverable in the library catalog)
  • Low-cost, low-maintenance website hosting for academics (using Reclaim Hosting to run the WordPress platform for Project Vox)
  • Gathering alternative metrics for an open-access website (using, for example, the Altmetric Explorer for Institutions tool that’s licensed by Duke University Libraries, or Google Analytics data to monitor site traffic)
  • Research management for teams over time (using Duke-licensed tools like Box for sharing and organizing content)
  • Project management tools and approaches (e.g., use of MeisterTask for project management; Toolkits for provisioning resources to team members; sponsored accounts for providing outside reviewers with access to pre-publication entries)
  • Collaborative workflows for research and publishing (in particular, how to take what has been a predominantly solitary enterprise—humanistic research and writing—and make it a collective effort)
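As a rough illustration of the first topic in the list above, the sketch below builds the kind of Dublin Core and Open Graph `<meta>` tags that citation managers and social media platforms can read from a page's header. It is a minimal, hypothetical example and not how the WordPress plugins used by Project Vox work: the function name `citation_meta_tags` and the choice of fields are assumptions made for illustration.

```python
from html import escape


def citation_meta_tags(title, url, description):
    """Build Dublin Core and Open Graph <meta> tags for a page header.

    Hypothetical helper for illustration; the fields chosen here are a
    small subset of what either vocabulary allows.
    """
    fields = [
        # Dublin Core elements are conventionally embedded via name="DC.*".
        ("name", "DC.title", title),
        ("name", "DC.identifier", url),
        ("name", "DC.description", description),
        # Open Graph properties use property="og:*".
        ("property", "og:title", title),
        ("property", "og:url", url),
        ("property", "og:description", description),
    ]
    return "\n".join(
        f'<meta {attr}="{key}" content="{escape(value, quote=True)}">'
        for attr, key, value in fields
    )


tags = citation_meta_tags(
    title="Project Vox",
    url="http://projectvox.library.duke.edu",
    description=(
        "An open educational resource for incorporating early modern "
        "women philosophers into research and instruction."
    ),
)
print(tags)
```

Escaping the values with `html.escape` keeps quotation marks or ampersands in a title from breaking the generated markup.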

Following are a few lessons learned from the Project Vox team members:

Roy Auh, T’19

“I’ve learned new research techniques by working on a team and it’s been awesome to grow from a research assistant to a lead researcher. […] Now I know the trusted sources and my way around the library and the available databases.”

Jen Semler, T’19

“I now have a better understanding of the many factors (finding sources, translating texts, acquiring images, applying for funding, etc.) that go into a large-scale research project like this. I am often impressed with all the work this team has been able to do and how well the team has communicated.”

Mattia Begali, Romance Studies Lecturing Fellow

“I really had the chance to explore how a community of practice like the Project Vox team interacts and collaborates. Behind the scenes of Project Vox there is a complex digital habitat meant to sustain the workflow of the group. Before joining the team of this project, I was only partially aware of how complex and stratified this digital habitat is. Also, it was fascinating for me to see how the role of each member gets defined by highly specialized practices.”

Liz Crisenbery, PhD Candidate in Musicology

“Being involved with Project Vox and DSS has also pushed me to think about how my research intersects with the digital humanities. As a musicologist who studies opera, I’m keen to incorporate recorded performances into my dissertation; providing open access to the music I write about is very important to facilitate a better understanding of my work.”

Abigail Flanigan, 2016 MLS from the UNC-CH School of Information & Library Science

“In my time as an intern on Project Vox, I learned about these very specific topics (editorial processes for digital publications and altmetrics), but I also learned more broadly about what it takes to create and manage a digital project on this scale. From the legal (researching copyright owners for images) to the technical (building the site’s infrastructure) to the creative (designing the logo), it takes many people with many different skills to build a project like Project Vox.”

More information

If you’d like to be a part of the Project Vox team, there are a number of ways you can get involved:

  • Volunteer
  • Participate in a Field Experience through Digital Scholarship Services
  • Earn course credit (spring 2019)
    • Bass Connections project, EHD 396 for undergraduates and EHD 796 for graduate students
    • Digital Publishing course, ISS 550S

To keep up with Project Vox on social media, please follow them on Facebook and Twitter. For questions about this and other digital scholarship projects or to get advice for your own, contact askdigital@duke.edu.