
Sharing data and research in a time of global pandemic

[Header image from the New York Times Coronavirus Map, March 17th, 2020]

Just before Duke stopped travel for all faculty and staff last week, I was able to attend what will probably turn out to have been one of the last conferences of the spring: the Research Data Access and Preservation Association’s (RDAP) annual summit in Santa Fe, New Mexico. RDAP is a community of “data managers and curators, librarians, archivists, researchers, educators, students, technologists, and data scientists from academic institutions, data centers, funding agencies, and industry who represent a wide range of STEM disciplines, social sciences, and humanities,” and who are committed to creating, maintaining, and teaching best practices for the access and preservation of research data. While there were many interesting presentations and posters about the work being done in this area at institutions around the country, the conference, and RDAP’s work more broadly, resonated with me in a general and timely way that did not necessarily stem from anything I heard during the week.

In a situation like the global pandemic we are now facing, open and unfettered access to research data is vital for treating patients, attempting to stem the course of the disease, and potentially developing life-saving vaccines.

A recent editorial in Science Translational Medicine argues that data-driven models and centralized data sharing are the best way to approach this kind of outbreak, stating “[w]e believe that scientific efforts need to include determining the values (and ranges) of the above key variables and identifying any other important ones. In addition, information on these variables should be shared freely among the scientific and the response and resilience communities, such as the Red Cross, other nongovernmental organizations, and emergency responders” [1]. As another article points out, sharing viral samples from around the world has allowed scientists to get a better picture of the disease’s genetic makeup: “[c]omparing those genomes allowed Bedford and colleagues to piece together a viral family tree. ‘We can chart this out on the map, then, because we know that this genome is connected to this genome by these mutations,’ he said. ‘And we can learn about these transmission links’” [2].


Scientists are also accelerating the research lifecycle by using preprint servers like arXiv, bioRxiv, and medRxiv to share their preliminary conclusions without waiting on the often glacial process of peer review. This isn’t an unalloyed positive, and many preprints warrant the increased scrutiny that peer review represents. Moreover, scientific research often benefits from the kind of contextualization and unpacking that peer review and science journalism can occasionally provide. But in the acute crisis that the current outbreak presents, the rapid spread of information among scientific peer networks can undoubtedly save lives.

Continuing to build the infrastructure, in terms of both technology and policy frameworks, needed for the kind of data sharing we are seeing now remains a goal for the scientific community.

The Libraries, along with communities like RDAP, the Research Data Alliance, and the Data Curation Network, endorse and support this mission, and we will continue to play our role in preserving and providing persistent access to research data as best we can as we all move forward through this together. In the meantime, we hope everyone in the Duke community stays safe and healthy!

[1] Layne, S. P., Hyman, J. M., Morens, D. M., & Taubenberger, J. K. (2020, March 11). New coronavirus outbreak: Framing questions for pandemic prevention. Science Translational Medicine 12(534). https://doi.org/10.1126/scitranslmed.abb1469

[2] Sanders, L. (2020, February 13). Coronavirus’s genetic fingerprints are used to rapidly map its spread. Science News. https://www.sciencenews.org/article/coronavirus-genetic-fingerprints-are-used-to-rapidly-map-spread