Revisiting: What is the Repository? - Bitstreams: The Digital Collections Blog

Here at the Duke University Libraries we recently hosted a series of workshops that were part of a larger Research Symposium on campus. It was an opportunity for various campus agencies to talk about all of the evolving and innovative ways that they are planning for and accommodating research data. A few of my colleagues and I were asked to present on the new Research Data program that we’re rolling out in collaboration with the Duke Digital Repository, and we were happy to oblige!

I was asked to speak directly about the various software development initiatives that we have underway with the Duke Digital Repository. Since we’re in the midst of rolling out a brand new program area, we’ve got a lot of things cooking!

When I started planning for the conversation I initially thought I would talk a lot about our Fedora/Hydra stack, and the various inter-related systems that we’re planning to integrate into our repository eco-system. But what resulted from that was a lot of technical terms and open-source software project names that didn’t mean a whole lot to anyone, especially those not embedded in the work. As a result, I took a step back and decided to focus at a higher level. I wanted to show our faculty that we were implementing a series of software solutions that would meet their needs for accommodating their data. This had me revisiting the age-old question: What is our Repository? And for the purposes of this conversation, it boiled down to this:

And this:

It is a highly complex, often mind-boggling set of software components that are wrangled and tamed by a highly talented team with a diversity of skills and experience, all for the purpose of supporting Preservation, Curation, and Access of digital materials.

Those are our tenets or objectives. They are the principles that guide our work. Let’s dig in a bit on each.

Our first objective is Preservation. We want our researchers to feel 100% confident that when they give us their data, we are preserving its integrity, longevity, and persistence.

Our second objective is to support Curation. We aim to do that by providing software solutions that facilitate management and description of file sets, and logical arrangement of complex data sets. This piece is critically important because the data cannot be optimized without solid description and modeling that informs on its purpose and intended use, and that facilitates discovery of the materials for use.

Finally, our work, our software, aims to facilitate discovery & access. We do this by architecting thoughtful solutions that optimize metadata and modeling. We build out features that enhance the consumption and usability of different format types, and we tweak, refine, and optimize our code to enhance performance and user experience.

The repository is a complex beast. It’s a software stack, and an eco-system of components. It’s Fedora. It’s Hydra. It’s a whole lot of other project names that are equally attractive and mystifying. At its core though, it’s a software initiative, one that seeks to serve up an eco-system of components with optimal functionality that meet the needs and desires of our programmatic stakeholders: our University.

Preservation, Curation, & Access are the heart of it.

Notes from the Duke University Libraries Digital Projects Team