Tag Archives: Duke Digital Repository

Blacklight Summit 2016

Last week I traveled to lovely Princeton, NJ to attend Blacklight Summit. For the second year in a row a smallish group of developers who use or work on Project Blacklight met to talk about our work and learn from each other.

Blacklight is an open source project written in Ruby on Rails that serves as a discovery interface over an Apache Solr search index. It’s commonly used to build library catalogs, but is generally agnostic about the source and type of the data you want to search. It was even used to help reporters explore the leaked Panama Papers.

At Duke we’re using Blacklight as the public interface to our digital repository. Metadata about repository objects is indexed in Solr, and we use Blacklight (with a lot of customizations) to provide access to digital collections, including images, audio, and video. Collections include the Gary Monroe Photographs, J. Walter Thompson Ford Advertisements, and Duke Chapel Recordings, among many others.

Blacklight has also been selected to replace the aging Endeca-based catalog that provides search across the TRLN libraries. Expect to hear more information about this project in the future.

Blacklight Summit is more of an unconference meeting than a conference, with a relatively small number of participants. It’s a great chance to learn and talk about common problems and interests with library developers from other institutions.

I’m going to give a brief overview of some of what we talked about and did during the two-and-a-half-day meeting and provide links for you to explore more on your own.

First, a representative from each institution gave a roughly five-minute overview of how they’re using Blacklight.

The group participated in a workshop on customizing Blacklight. The organizers paired people based on experience, so the most experienced and least experienced (self-identified) were paired up, and so on. Link to the GitHub project for the workshop: https://github.com/projectblacklight/blacklight_summit_demo

We got an update on the state of Blacklight 7. Some of the highlights of what’s coming:

  • Move to Bootstrap 4 from Bootstrap 3
  • Use of HTML 5 structural elements
  • Better internationalization support
  • Move from helpers to presenters. (What are presenters: http://nithinbekal.com/posts/rails-presenters/)
  • Improved code quality
  • Partial structure that makes overrides easier

A release of Blacklight 7 won’t be ready until Bootstrap 4 is released.

There were also several conversations and breakout sessions about Solr, the search platform whose index powers Blacklight. I won’t go into great detail here, but some topics discussed included:

  • Developing a common Solr schema for library catalogs.
  • Tuning the performance of Solr when the index is updated frequently. (Items that are checked out or returned need to be indexed relatively frequently to keep availability information up to date.)
  • Support for multi-lingual indexing and searching in Solr, especially Chinese, Japanese, and Korean languages. Stanford has done a lot of work on this.

I’m sure you’ll be hearing more from me about Blacklight on this blog, especially as we work to build a new TRLN shared catalog with it.

Open Source Software and Repository land

The Duke University Libraries software development team just recently returned from a week in Boston, MA at a conference called Hydra Connect.  We ate good seafood, admired beautiful cobblestones, strolled along the Charles River, and learned a ton about what’s going on in the Hydra-sphere.

At this point you may be scratching your head, exclaiming- huh?!  Hydra?  Hydrasphere?  Have no fear, I shall explain!


Our repository, the Duke Digital Repository, is a Hydra/Fedora Repository.  Hydra and Fedora are names for two prominent open-source communities in repository land.  Fedora concerns itself with architecting the back-end of a repository- the storage layer.  Hydra, on the other hand, refers to a multitude of end-user applications that one can architect on top of a Fedora repository to perform digital asset management.  Pretty cool and pretty handy.  Especially for someone that has no interest in architecting a repository from scratch.

And for a little context re: open source… the idea is that a community of like-minded individuals that care about a particular thing, will band together to develop a massively cool software product that meets a defined need, is supported and extended by the community, and is offered for free for someone to inspect, modify and/or enhance the source code.


I italicized ‘free’ to emphasize that while the software itself is free, and while the source code is available for download and modification, it does take a certain suite of skills to architect a Hydra/Fedora Repository.  It’s not currently an out-of-the-box solution, but is moving in that direction with Hydra-in-a-Box.  But I digress…

So.  Why might someone be interested in joining an open-source community such as these?  Well, for many reasons, some of which might ring true for you:

  • Resources are thin.  Talented developers are hard to find and harder to recruit.  Working with an open source community means that 1) you have the source code to get started, 2) you have a community of people who are available to be (and generally enthusiastic about being) a resource, and 3) working collaboratively makes everything better.  No one wants to go it alone.
  • Governance.  If one gets truly involved at the community level there are often opportunities for contributing thoughts and opinions that can help to shape and guide the software product.  That’s super important when you want to get invested in a project and ensure that it fully meets your needs.  Going it alone is never a good option, and the whole idea of open-source is that it’s participatory, collaborative, and engaged.
  • Give back.  Perhaps you have a great idea.  A fantastic use case.  Perhaps one that could benefit a whole lot of other people and/or institutions.  Well then share the love by participating in open-source.  Instead of developing a behemoth locally that is not maintainable, contribute ideas or features or a new product back to the community.  It benefits others, and it benefits you, by investing the community in the effort of folding features and enhancements back into the core.

Hydra Connect was a fantastic opportunity to mingle with like-minded professionals doing very similar work, and all really enthusiastic to share their efforts.  They want you to get excited about their work.  To see how they are participating in the community.  How they are using this variety of open-source software solutions in new and innovative ways.

It’s easy to get bogged down at a local level with the micro details, and to lose the big picture.  It was refreshing to step out of the office and get back into the frame of mind that recognizes and empowers the notion that there is a lot of power in participating in healthy communities of practice.  There is also a lot of economy in it.

The team came back to Durham full of great ideas and a lot of enthusiasm.  It has fueled a lot of fantastic discussion about the future of our repository software eco-system and how that complements our desire to focus on integration, community developed goodness, and sustainable practices for software development.

More to come as we turn that thought process into practice!

Team Hydra Connect 2016

Project Hydra

Hydra Connect 2016

Research at Duke and the future of the DDR

The Duke Digital Repository (DDR) is a growing service, and the Libraries are growing to support it. As I post this entry, our jobs page shows three new positions comprising five separate openings that will support the DDR. One is a DevOps position which we have re-envisioned from a salary line that opened with a staff member’s departure. The other four consist of two new positions, with two openings for each, created to meet specific, emerging needs for supporting research data at Duke.

Last fall at Duke, the Vice Provosts for Research and the Vice President for Information Technology convened a Digital Research Faculty Working Group. It included a number of faculty members from around campus, as well as several IT administrators, the latter of whom served in an ex-officio capacity. The Libraries were represented by our Associate University Librarian for Information Technology, Tim McGeary (who happens to be my supervisor).

Membership of the Digital Research Faculty Group. This image and others in the post are slides taken from a presentation I gave to the Libraries’ all-staff meeting in August.


Developing the Duke Digital Repository is Messy Business

Let me tell you something people: Coordinating development of the Duke Digital Repository (DDR) is a crazy logistical affair that involves much ado about… well, everything!

My last post, What is a Repository?, discussed at a high level what exactly a digital repository is intended to be and the purpose it serves in the Libraries’ digital ecosystem.  If we take a step down from that, we can categorize the DDR as two distinct efforts: 1) a massive software development project and 2) a complex service suite.  Both require significant project management and leadership, and necessitate tools to help in coordinating the effort.

There are many, many details that require documenting and tracking through the life cycle of a software development project.  Initially we start with requirements, meaning what the tools need to do to meet the end users’ needs.  Requirements must be properly documented and must essentially detail a project management plan that can result in both a successful product (the software) and a successful project (the process, and everything that supports the success of the product itself).  From this we manage a ‘backlog’ of requirements, and pull from the backlog to structure our work.  Requirements evolve into tasks that are handed off to developers.  Tasks themselves become conversations as the development team determines the best possible approach to getting the work done.  In addition to this, there are bugs to track, changes to document, and new requirements evolving all of the time… you can imagine that managing all of this in a simple ‘To Do’ list could get a bit unwieldy.


We realized that our ability to keep all of these many plates spinning necessitated a really solid project management tool.  So we embarked on a mission to find just the right one!  I’ll share our approach here, in case you and your team have a similar need and could benefit from our experiences.

STEP 1: Establish your business case:  Finding the right tool will take effort, and getting buy-in from your team and organization will take even more!  Get started early with justifying to your team and your org why a PM tool is necessary to support the work.

STEP 2: Perform a needs assessment: You and your team should get around a table and brainstorm.  Ask yourselves what you need this tool to do, what features are critical, what your budget is, etc.  Create a matrix where you fully define all of these characteristics to drive your investigation.

STEP 3: Do an environmental scan: What is out there on the market?  Do your research and whittle down a list of tools that have potential.  Also build on the skills of your team- if you have existing competencies in a given tool, then fully flesh out its features to see if it fits the bill.

STEP 4:  Put them through the paces: Choose a select list of tools and see how they match up to your needs assessment.  Task a group of people to test-drive the tools, and report out on the experience.

STEP 5: Share your findings: Discuss the findings with your team.  Capture the highs and the lows and present the material in a digestible fashion.  If it’s possible to get consensus, make a recommendation.

STEP 6: Get buy-in: This is the MOST critical part!  Get buy-in from your team to implement the tool.  A PM tool can only benefit the team if it is used thoroughly, consistently, and in a team fashion.  You don’t want to deal with adverse reactions to the tool after the fact…


No matter what tool you choose, you’ll need to follow some simple guidelines to ensure successful adoption:

  • Once again… Get TEAM buy-in!
  • Define ownership, or an Admin, of the tool (ideally the Project Manager)
  • Define basic parameters for use and team expectations
  • PROVIDE TRAINING
  • Consider your ecosystem of tools and simplify where appropriate
  • The more robust the tool, the more support and structure will be required

Trust me when I say that this exercise will not let you down, and will likely yield a wealth of information about the tools that you use, the projects that you manage, your team’s preferences for coordinating the work, and much more!

Nobody Wants a Slow Repository

As we’ve been adding features and refining the public interface to Duke’s Digital Repository, the application has become increasingly slow. Don’t worry, the very slowest versions were never deployed beyond our development servers. This blog post is about how I approached addressing the application’s performance problems before they made their way to our production site.

A modern web application, like the public interface to Duke’s Digital Repository, is a complex beast, relying on layers of software and services just to deliver a bunch of HTML, CSS, and JavaScript to your web browser. A page like this, the front page to the Alex Harris collection, takes a lot to build: code to read configuration files, methods that assemble information needed to build the page, requests to Solr to find the images to display, requests to a separate administrative application service that provides contact information for the collection, another request to fetch related blog posts, and requests to our finding aid application to deliver information about the physical collection. All of these requests take time and all of them have to finish before anything gets delivered to your browser.

My main suspects for the slowness: HTTP requests to external services, such as the ones mentioned above; and repeated calls to slow methods in the application. But identifying precisely which HTTP requests are slow and what code needs to be optimized takes a bit of sleuthing.

The first thing I wanted to know was: how slow is this thing, really? Turns out it was getting really slow. Too slow. There’s old research (1960s old) about computer system performance and its impact on user perception and task performance that still applies today. This also-old (1993 old) article from the Nielsen Norman Group summarizes the issue nicely.

To determine just how slow things were getting I used Chrome’s developer tools. The “Network” tab in Chrome’s developer tools is where the hard truth comes to light about just how bloated and slow your web application is. Or, as my high school teachers used to say when handing back test results: “read ’em and weep.”


By using the Network tab in Chrome’s developer tools I was able to see that the browser was having to wait 15 or more seconds for anything to come back from the server. This is too slow.

The next thing I wanted to know was how many HTTP requests were being made to external services and which ones were being made repeatedly or were taking a long time. For this dose of reality I used the httplog gem, which logs useful information about every HTTP request, including how long the application has to wait for a response.

When added to the project’s Gemfile, httplog starts printing out useful information to the log about HTTP requests, such as this set of entries about the request to fetch finding aid information. I can see that the application is waiting over half a second to get a response back from the finding aid service:


D, [2016-08-06T12:51:09.531076 #2529] DEBUG -- : [httplog] Connecting: library.duke.edu:80
D, [2016-08-06T12:51:09.854003 #2529] DEBUG -- : [httplog] Sending: GET http://library.duke.edu:80/rubenstein/findingaids/harrisalex.xml
D, [2016-08-06T12:51:09.855387 #2529] DEBUG -- : [httplog] Data:
D, [2016-08-06T12:51:10.376456 #2529] DEBUG -- : [httplog] Status: 200
D, [2016-08-06T12:51:10.377061 #2529] DEBUG -- : [httplog] Benchmark: 0.520600972 seconds

As I expected, this request and many others were contributing significantly to the application’s slowness.
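
For a sense of what that benchmark figure measures, here is a minimal sketch (not httplog’s own implementation) that times a single call with Ruby’s standard Benchmark module. The fetch_finding_aid method is a hypothetical stand-in that sleeps instead of making a real HTTP request, and timed is a helper name I made up for illustration.

```ruby
require 'benchmark'

# Hypothetical stand-in for a slow external request; it sleeps
# instead of touching the network so the sketch runs anywhere.
def fetch_finding_aid
  sleep 0.05
  '<ead>...</ead>'
end

# Runs the block and returns its result plus the elapsed wall-clock seconds.
def timed
  result = nil
  seconds = Benchmark.realtime { result = yield }
  [result, seconds]
end

_body, seconds = timed { fetch_finding_aid }
puts format('Benchmark: %.3f seconds', seconds)
```

Wrapping a suspect call this way is a quick check when you only care about one request; a logging gem is more convenient when you want every request covered.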

It was a bit harder to determine which parts of the code and which methods were also making the application slow. For this, I mainly used two approaches. The first was to look at the application logs, which track how long different views take to assemble. This helped narrow down which parts of the code were especially slow (and also confirmed what I was seeing with httplog). For instance, in the log I can see the different partials that make up the whole page and how long each of them takes to assemble. From the log:


12:51:09 INFO: Rendered digital_collections/_home_featured_collections.html.erb (0.8ms)
12:51:09 INFO: Rendered digital_collections/_home_highlights.html.erb (1.3ms)
12:51:10 INFO: Rendered catalog/_show_finding_aid_full.html.erb (953.4ms)
12:51:11 INFO: Rendered catalog/_show_blog_post_feature.html.erb (0.9ms)
12:51:11 INFO: Rendered catalog/_show_blog_posts.html.erb (914.5ms)

(The finding aid and blog posts are slow due to the aforementioned HTTP requests.)
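
When the log gets long, a few lines of Ruby can rank the partials by render time. This is a rough sketch that assumes log lines in the format shown above; slowest_partials and the RENDERED pattern are names made up for illustration, not part of Rails.

```ruby
# Matches "Rendered <template> (<n>ms)" log lines like the ones above.
RENDERED = /Rendered (?<template>\S+) \((?<ms>[\d.]+)ms\)/

# Returns [template, milliseconds] pairs, slowest first.
# Lines that do not match the pattern are ignored.
def slowest_partials(log_text)
  timings = log_text.each_line.map do |line|
    match = RENDERED.match(line)
    [match[:template], match[:ms].to_f] if match
  end
  timings.compact.sort_by { |_template, ms| -ms }
end

log = <<~LOG
  12:51:09 INFO: Rendered digital_collections/_home_highlights.html.erb (1.3ms)
  12:51:10 INFO: Rendered catalog/_show_finding_aid_full.html.erb (953.4ms)
  12:51:11 INFO: Rendered catalog/_show_blog_posts.html.erb (914.5ms)
LOG

slowest_partials(log).each do |template, ms|
  puts format('%8.1fms  %s', ms, template)
end
```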


One particular area of concern was extremely slow searches. To identify the problem I turned to yet another tool. Rack-mini-profiler is a gem that when added to your project’s Gemfile adds an expandable tab on every page of the site. When you visit pages of the application in a browser it displays a detailed report of how long it takes to build each section of the page. This made it possible to narrow down areas of the application that were too slow.


What I found was that the thumbnail section of the page, which can appear twenty or more times on a search result page, was very slow. It wasn’t loading the images that was slow; rather, the code that selects the correct thumbnail image took a long time to run. (Thumbnail selection is complicated in the repository because there are various types and sources for thumbnails.)

Having identified several contributors to the site’s poor performance (expensive thumbnail selection, and frequent and costly HTTP requests to various services) I could now work to address each of the issues.

I used three different approaches to improving the application’s performance: fragment caching, memoization, and code optimization.

Caching


I decided to use fragment caching to address the slow loading of finding aid information. The benefit of caching is that it’s really fast. Once Rails has the snippet of HTML cached (either in memory or on disk, depending on how it’s configured) it can use that fragment of cached markup, bypassing a lot of code and, in this case, that slow HTTP request. One downside to caching is that if something in the finding aid changes the application won’t reflect the change until the cache is cleared or expires (after 7 days in this case).


<%# Cache the rendered finding aid fragment for 7 days, keyed by the EAD ID. %>
<% cache("finding_aid_brief_#{document.ead_id}", expires_in: 7.days) do %>
  <%= source_collection({ :document => document, :placement => 'left' }) %>
<% end %>
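
To make the idea concrete outside of Rails, here is a minimal sketch of a cache with expiry in plain Ruby. TinyCache is a hypothetical class written only for illustration; it is not how Rails implements its cache stores, and the cache key below is an example value. The injectable clock is an assumption added so expiry can be exercised without waiting.

```ruby
# Hypothetical in-memory cache, for illustration only.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Time.now })
    @entries = {}
    @clock = clock
  end

  # Returns the cached value for key if it has not expired; otherwise
  # runs the block and stores its result for expires_in seconds.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

renders = 0
cache = TinyCache.new
2.times do
  cache.fetch('finding_aid_brief_example', expires_in: 7 * 24 * 60 * 60) do
    renders += 1 # stands in for the slow HTTP request and rendering
    '<div>finding aid summary</div>'
  end
end
puts renders # the block ran only on the first fetch
```

The trade-off described above is visible here: until the entry expires, fetch never runs the block again, so upstream changes go unnoticed.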

Memoization

Memoization is similar to caching in that you’re storing information to be used repeatedly rather than recalculated every time. This can be a useful technique to use with expensive (slow) methods that get called frequently. The parent_collections_count method returns the total number of collections in a portal in the repository (such as the Digital Collections portal). This method is somewhat expensive because it first has to run a query to get information about all of the collections and then count them. Since this gets used more than once, I’m using Ruby’s conditional assignment operator (||=) to tell Ruby not to recalculate the value of @parent_collections_count every time the method is called. With memoization, if the value is already stored Ruby just reuses the previously calculated value. (There are some gotchas with this technique, but it’s very useful in the right circumstances.)


# Memoize the count so the collections query runs at most once per instance.
def parent_collections_count
  @parent_collections_count ||= response(parent_collections_search).total
end
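
One of those gotchas is worth a quick sketch: ||= only skips the work when the stored value is truthy, so a method that legitimately returns nil or false gets recalculated on every call. (A count is safe, since 0 is truthy in Ruby.) The Portal class and its methods below are hypothetical, made up to show the difference.

```ruby
# Hypothetical class for illustration; queries counts the expensive lookups.
class Portal
  attr_reader :queries

  def initialize(featured)
    @featured = featured
    @queries = 0
  end

  # ||= re-runs the lookup whenever the stored value is nil or false.
  def featured_naive
    @featured_naive ||= lookup
  end

  # Checking defined? memoizes nil and false results too.
  def featured_safe
    return @featured_safe if defined?(@featured_safe)

    @featured_safe = lookup
  end

  private

  # Stands in for an expensive query.
  def lookup
    @queries += 1
    @featured
  end
end

naive = Portal.new(nil)
3.times { naive.featured_naive }
puts naive.queries # 3, because nil was never memoized

safe = Portal.new(nil)
3.times { safe.featured_safe }
puts safe.queries # 1
```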

Code Optimization

One of the reasons thumbnails were slow to load in search results is that some items in the repository have hundreds of images. The method used to find the thumbnail path was loading image path information for all the item’s images rather than just the first one. To address this I wrote a new method that fetches just the item’s first image to use as the item’s thumbnail.
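
The shape of that change can be sketched as follows. This is not the repository’s actual code: the Item class, its method names, and the path format are all hypothetical, and the lookup counter exists only to show how much work each approach does.

```ruby
# Hypothetical item: a list of image IDs plus a counter of path lookups.
class Item
  attr_reader :lookups

  def initialize(image_ids)
    @image_ids = image_ids
    @lookups = 0
  end

  # Before: resolves a path for every image, then keeps only the first.
  def thumbnail_path_slow
    @image_ids.map { |id| image_path(id) }.first
  end

  # After: resolves a path for the first image only.
  def thumbnail_path_fast
    first_id = @image_ids.first
    first_id && image_path(first_id)
  end

  private

  # Stands in for an expensive path lookup.
  def image_path(id)
    @lookups += 1
    "/images/#{id}.jpg"
  end
end

slow_item = Item.new((1..300).to_a)
slow_item.thumbnail_path_slow
puts slow_item.lookups # 300

fast_item = Item.new((1..300).to_a)
fast_item.thumbnail_path_fast
puts fast_item.lookups # 1
```

Both methods return the same path; the only difference is how many lookups happen along the way, which adds up quickly across twenty results per page.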

Combined, these changes made a significant improvement to the site’s performance. Overall application speed and performance will remain one of our priorities as we add features to the Duke Digital Repository.

What is a Repository?

We’ve been talking a lot about the Repository of late, so I thought it might be time to come full circle and make sure we’re all on the same page here…. What exactly is a Repository?

A Repository is essentially a digital shelf.  A really, really smart shelf!

It’s the place to safely and securely store digital assets of a wide variety of types for preservation, discovery, and use, though not all materials in the repository may be discoverable or accessible by everyone.  So, it’s like a shelf.  Except that this shelf is designed to help us preserve these materials and try to ensure they’ll be usable for decades.  


This shelf tells us if the materials on it have changed in any way.  It tells us when the materials don’t conform to the format specification that describes exactly how a file format is to be represented.  It has very specific permissions, a well thought out backup procedure that reaches several corners of the country, a built-in versioning system to allow us to migrate endangered or extinct formats to new, shiny formats, and a bunch of other neat stuff.

The repository is the manifestation of a conviction about the importance of an enduring scholarly record and open and free access to Duke scholarship.  It is where we do our best to carve our knowledge in stone for future generations.  

Why? is perhaps the most important question of all.  There are several approaches to Why?  National funding agencies (NIH, NSF, NEH, etc) recognize that science is precariously balanced on shoddy data management practices and increasingly require researchers to deposit their data with a reputable repository.  Scholars would like to preserve their work, make it accessible to everyone (not just those who can afford outrageously priced journal subscriptions), and want to increase the reach and impact of their work by providing stable and citable DOIs.  

Students want to be able to cite their own theses, dissertations, and capstone papers and to have others discover and cite them.  The Library wants to safeguard its investment in digitization of Special Collections.  Archives needs a place to securely store university records.


A Repository, specifically our Duke Digital Repository, is the place to preserve our valuable scholarly output for many years to come.  It ensures disaster recovery, facilitates access to knowledge, and connects you with an ecosystem of knowledge.

Pretty cool, huh?!

Repository Mega-Migration Update

We are shouting it from the rooftops: The migration from Fedora 3 to Fedora 4 is complete!  And Digital Repository Services are not the only ones relieved.  We appreciate the understanding that our colleagues and users have shown as they’ve been inconvenienced while we’ve built a more resilient, more durable, more sustainable preservation platform in which to store and share our digital assets.


We began the migration of data from Fedora 3 on Monday, May 23rd.  Since then we’ve migrated roughly 337,000 objects in the Duke Digital Repository.  The data migration was split into several phases.  In case you’re interested, here are the details:

  1. Collections were identified for migration beginning with unpublished collections, which comprise about 70% of the materials in the repository
  2. Collections to be migrated were locked for editing in the Fedora 3 repository to prevent changes that inadvertently won’t be migrated to the new repository
  3. Collections to be migrated were passed to 10 migration processors for actual ingest into Fedora 4
    • Objects were migrated first.  This includes the collection object, content objects, item objects, color targets for digital imaging, and attachments (objects related to, but not part of, a collection, like deposit agreements)
    • Then relationships between objects were migrated
    • Last, metadata was migrated
  4. Collections were then validated in Fedora 4
  5. When validation is complete, collections will be unlocked for editing in Fedora 4

Presto!  Voila!  That’s it!


While our customized version of the Fedora migrate gem does some validation of migrated content, we’ve elected to build an independent process to provide validation.  Some of the validation is straightforward such as comparing checksums of Fedora 3 files against those in Fedora 4.  In other cases, being confident that we’ve migrated everything accurately can be much more difficult. In Fedora 3, we can compare checksums of metadata files while in Fedora 4 object metadata is stored opaquely in a database without checksums that can be compared.  The short of it is that we’re working hard to prove successful migration of all of our content and it’s harder than it looks.  It’s kind of like insurance- protecting us from the risk of lost or improperly migrated data.
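
The straightforward part of that validation can be pictured as comparing two manifests of checksums. What follows is only a sketch under stated assumptions: the compare_manifests helper, the manifest layout (path mapped to checksum), and the sample values are invented for illustration and say nothing about how our validation process is actually built.

```ruby
# Compares two {path => checksum} manifests and reports discrepancies.
def compare_manifests(source, target)
  {
    # Paths present in the source but absent from the target.
    missing: source.keys - target.keys,
    # Paths present in both whose checksums differ.
    mismatched: source.select { |path, sum| target.key?(path) && target[path] != sum }.keys
  }
end

# Made-up sample data standing in for checksums gathered from each repository.
fedora3 = { 'obj1/content' => 'aaa111', 'obj2/content' => 'bbb222', 'obj3/content' => 'ccc333' }
fedora4 = { 'obj1/content' => 'aaa111', 'obj2/content' => 'XXX999' }

report = compare_manifests(fedora3, fedora4)
puts "missing: #{report[:missing].inspect}"
puts "mismatched: #{report[:mismatched].inspect}"
```

The hard cases described above are exactly the ones where no comparable checksum exists on one side, so there is no second manifest to diff against.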

We’re in the final phases of spiffing up the Fedora 4 Digital Repository user interface, which is scheduled to be deployed the week of July 11th.  That release will not include any significant design changes, but will simply be compatible with the new Fedora 4 code base.  We are planning to release enhancements to our Data & Visualizations collection, and are prioritizing work on the homepage of the Duke Digital Repository… you will likely see an update on that coming up in a subsequent blog post!

Preservation Architecture: Phase 2 – Moving Forward with Duke Digital Repository

 

DukeSpace circa 2013

 

In 2013, the average price for a gallon of gas was $3.80, President Obama was inaugurated for a second term, and Duke University Libraries offered DukeSpace as an institutional repository.  Some things haven’t changed much, but the preservation architecture protecting the digital materials curated by the Libraries has changed a lot!

We still provide DukeSpace, but are laying the foundation to migrate collections and processes to the Duke Digital Repository (DDR).  The DDR was conceived of and developed as a digital preservation repository, an environment intended to preserve and sustain the rich digital collections; university scholarship and research data; purchased collections; and the history of Duke far into the future.  Only through the grace of our partnership with Digital Projects and Production Services has the DDR recently also become a site that no longer hurts the eyes of our visitors.

The Duke Digital Repository endeavors to protect our assets from a large and diverse set of threats. Of course, there are threats that the systems model presented here does not address, such as those identified in the SPOT Model for Risk Assessment. The baseline threats we formally consider include:

  • Natural disasters including accidents at our local nuclear power station, fire, and hurricanes
  • Data degradation also known as bit rot or bit decay
  • External actors or threats posed by people external to the DDR team including those who manage our infrastructure
  • Internal actors including intentional or unintentional security risks and exploits by privileged staff in the libraries and supporting IT organizations

Phase 1 of our ingress into digital preservation established that DSpace, the software powering DukeSpace, was not sufficient for our needs, which led to an environmental scan and pilot project with Fedora and then Fedora and Hydra. This provided us with some of the infrastructure to mitigate the threats we had identified, but not all.  In Phase 1 we were able to perform some important preservation tasks including:

  • Prove authenticity by offering checksum fixity validation on ingest and periodically
  • Identify and report on data degradation
  • Capture context in the form of descriptive, administrative, and technical metadata
  • Identify files in need of remediation using file characterization tools
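
For readers new to fixity checking, the core idea fits in a few lines: record a digest when a file is ingested, recompute it later, and compare. The sketch below uses Ruby’s standard Digest library; the choice of SHA-256, the fixity_ok? helper, and the temporary file are assumptions for illustration and not a description of the DDR’s implementation.

```ruby
require 'digest'
require 'tempfile'

# Returns true when the file's current SHA-256 digest matches the recorded one.
def fixity_ok?(path, recorded_sha256)
  Digest::SHA256.file(path).hexdigest == recorded_sha256
end

Tempfile.create('asset') do |file|
  file.write('original bytes')
  file.flush
  recorded = Digest::SHA256.file(file.path).hexdigest # stored at ingest

  puts fixity_ok?(file.path, recorded) # true: nothing has changed

  file.write('!') # simulate corruption by appending a byte
  file.flush
  puts fixity_ok?(file.path, recorded) # false: the digest no longer matches
end
```

Running the same comparison on a schedule is what turns a one-time ingest check into ongoing detection of data degradation.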

Phase 2 allows us to address a greater range of threats and therefore offer a higher level of security to our collections.  In Phase 2 we’re doing several concurrent migrations including migrating our archival storage to infrastructure that will allow for dynamic resizing, de-duplication, and block-level integrity checking; moving to a horizontally scaled server architecture to allow the repository to grow to meet increasing demands of size (individual file size and size of collection) and traffic; and adopting a cloud replication disaster recovery process using DuraCloud to replace our local-only disk/tape infrastructure.  These changes provide significant protection against our baseline threat model by providing geographic diversity to our replicas, allowing us to constantly monitor the health of our 3 cloud replicas, and providing administrative diversity to the management of our replicas ensuring no single threat may corrupt all 4 copies of our data.

More detail about the repository architecture to come.

 

Looking to the Future of the Duke Digital Repository: Defining a Program for Digital Preservation, Management & Access

Our modern day lives and professional endeavors are teeming with digital output.  We participate in the digital ecosystem every day, contributing our activities, our scholarship, and our work in new and evolving ways.  Some of that contribution gets lost in the Internet ether, and some gets saved, or preserved, in specific, often localized ways that are neither sustainable nor preservable for the long haul.  We here at the Duke University Libraries want to be able to look to the future with confidence, knowing that we have a game plan for capturing and preserving digital objects that are necessary and vital to the university community.  Cue the new Duke Digital Repository.  


The Duke Digital Repository is a software development initiative undertaken by the Digital Repository Services department in the Duke University Libraries.  It is a preservation repository architected using the Fedora Open Source software project, which is intended to replace the current manifestation of our institutional repository, DukeSpace.  It is a superior product that is provisioned specifically for the preservation, storage, and access of digital objects.  The Duke Digital Repository is fully operational; we are now in the process of refining user interfaces, ingesting new and varied collections, and assessing descriptive metadata needs for ingested collections.  


So what’s next?  Well we’ve got the Duke Digital Repository as a platform, now we need the Duke Digital Repository as a program.  We need to clarify the services and support that we offer to the university community, we need to fully define its stakeholders, and we need to implement an organizational structure to support a robust service.  

Here are just a few things that we’re engaged in that are seeking to define our user groups and assess their needs in a preservation platform and digital support service.  Defining these expectations will allow us to take the next step in crafting a sustainable and relevant program to support the digital scholarship of the university.

  • ITHAKA Faculty Survey: In the Fall semester of 2015, the Libraries deployed the ITHAKA S+R Faculty Survey.  Faculty are considered a primary stakeholder of the repository, as it is well provisioned to meet their data management needs.  260 faculty members responded to the survey, sharing their thoughts on a variety of topics including scholarly communications services, research practices, data preservation and management needs, and much more.  There was a lot of valuable, actionable data contributed, which pertains directly to the repository as a preservation tool, and a service for data support.  The digital repository team is working through this data to identify and target needs and desires in a repository program.
  • Graduate & Undergraduate Advisory Boards:  The Digital Repository staff are also working with the Assessment & User Experience team within the library to reach out to graduate and undergraduate student constituents to capture their voice.  We have collectively identified a list of questions and prompts that will engage them in a discussion about their needs pertaining to the repository as a tool and a service.  From this discussion we are also gauging their understanding of ‘a repository’ and hoping to glean some information that will help us to understand how we might brand and market the repository more effectively.  
  • Fedora Community: Fedora is an open source software product developed and stewarded by the DuraSpace community.  The Duke University Libraries are active participants in the community which is essentially a consortium of academic institutions that are working toward a common goal of preserving intellectual, cultural, and scientific heritage.  We are reaching out to our community constituents to ask how other institutions similar to ours are supporting their repository  programs.  We’re assessing  various models of support and generating a discussion around repository support as a resourced program, rather than a simple software solution.  We are also working with Assessment & User Experience to conduct an environmental scan and literature review to gain greater insight and understanding of best practice.


In short, we want to make the repository special, and relevant to its users.  We want to feel confident that it provides a service that is valuable and necessary for our university community.  We invite your feedback as we embark on this effort.  For further information or to give us your feedback, please contact us.