
Squirlicorn, spirit guide of the digital repository: Four things you should know

One thing I’ve learned on my life’s journey is the importance of knowing your spirit guide.

That’s why, by far, the most important point I made in a talk at the TRLN Annual Meeting in July is that the spirit guide of the digital repository movement is the squirlicorn.


The Inaugural TRLN Institute – an Experiment in Consortial Collaboration

In June of this year I was fortunate to participate in the inaugural TRLN Institute. Modeled as a sort of Scholarly Communication Institute for TRLN (the Triangle Research Libraries Network, a consortium located in the Triangle region of North Carolina), the Institute provided space (the magnificent Hunt Library on North Carolina State University’s campus), time (three full days), and food (Breakfast! Lunch! Coffee!) for groups of 4-6 people from member libraries to get together and focus exclusively on developing innovative solutions to shared problems. Not only was it productive, it was truly delightful to spend time with colleagues from member institutions who, although we are geographically close, don’t get together often enough.

Six projects were chosen from a pool of applicants who proposed topics around this year’s theme of Scholarly Communication:

  • Supporting Scholarly Communications in Libraries through Project Management Best Practices
  • Locating Research Data in an Age of Open Access
  • Clarifying Rights and Maximizing Reuse with RightsStatements.org
  • Building a Research Data Community of Practice in NC
  • Building the 21st Century Researcher Brand
  • Scholarship in the Sandbox: Showcasing Student Works

You can read descriptions of the projects as well as group membership here.

The 2017 TRLN Institute participants and organizers, a happy bunch.

Having this much dedicated and unencumbered time to thoughtfully and intentionally address a problem area with colleagues was invaluable. And the open schedule allowed groups to be flexible as their ideas and expectations changed throughout the course of the three-day program. My own group – Clarifying Rights and Maximizing Reuse with RightsStatements.org – was originally focused on developing practices for the application and representation of RightsStatements.org statements for TRLN libraries’ online digitized collections. Through talking as a group, however, we realized early on that some of the stickiest issues in implementing a new rights management strategy involve the work an institution has to do to identify appropriate staff, allocate resources, plan, and document the process.

So, we pivoted! Instead of developing a decision matrix for applying the RS.org statements in digital collections (which is what we originally thought our output would be), we instead spent our time drafting a report – a roadmap of sorts – that describes the following important components when implementing RightsStatements.org:

  • roles and responsibilities (including questions that a person in a role would need to ask)
  • necessary planning and documentation
  • technical decisions
  • example implementations (including steps taken and staff involved – perhaps the most useful section of the report)

This week, we put the finishing touches on our report: TRLN Rights Statements Report – A Roadmap for Implementing RightsStatements.org Statements (yep, yet another Google Doc).  We’re excited to get feedback from the community, as well as to hear how other institutions are handling rights management metadata, especially as it relates to upstream archival information management. This is an area ripe for future exploration!

I’d say that the first TRLN Institute was a success. I can’t imagine my group having self-organized and produced a document in just over a month without having first had three days to work together in the same space, unencumbered by other responsibilities. I think other groups have found valuable traction via the Institute as well, which will result in more collaborative efforts. I look forward to seeing what future TRLN Institutes produce – this is definitely a model to continue!

Pink Squirrel: It really is the nuts

During the last 8 months that I’ve worked at Duke, I’ve noticed a lot of squirrels. They seem to be everywhere on this campus, and, not only that, they come closer than any squirrels I’ve ever seen. In fact, while we were working outside yesterday, a squirrel hopped onto our table and tried to take an apple from us. It’s become a bit of a joke in my department, actually. We take every opportunity we can to make a squirrel reference.

Anyhow, since we talk about squirrels so often, I decided I’d run a search in our digital collections to see what I’d get. The only image returned was the billboard above, but I was pretty happy with it. In fact, I was so happy with it that I used this very image in my last blog post. At the time, though, I was writing about what my colleagues and I had been doing on the new research data initiative since the beginning of 2017, so I simply used it as a visual to make my coworkers laugh. However, I reminded myself to revisit and investigate. Plus, although I bartended for many years during grad school, I’d never made (much less heard of) a Pink Squirrel cocktail. Drawing inspiration from our friends in the Rubenstein Library who write for “The Devil’s Tales” in the “Rubenstein Library Test Kitchen” category, I thought I’d not only write about what I learned, but also try to recreate the drink.

This item comes from the “Outdoor Advertising Association of America (OAAA) Archives, 1885-1990s” digital collection, which includes over 16,000 images of outdoor advertisements and other scenes. It is one of a few digital outdoor advertising collections that we have, which have been written about here previously.

This digital collection houses six Glenmore Distilleries Company billboard images in total. Two are for liquors (a bourbon and a gin), and four are for “ready-to-pour” Glenmore cocktails.

These signs indicate that Glenmore Distilleries Company created a total of 14 ready-to-pour cocktails. I found a New York Times article from August 19, 1965 in our catalog stating that Glenmore Distilleries Co. had expanded its line to 18 drinks, which means that the billboards in our collection have to pre-date 1965. The company’s president, Frank Thompson Jr., was quoted as saying that he expected “exotic drinks” to account for any future surge in sales of bottled cocktails.

OK, so I learned that Glenmore Distilleries had bottled a drink called a Pink Squirrel sometime before 1965. Next, I needed to do some research on the Pink Squirrel itself. Had Glenmore created it? What was in it? Why was it PINK?

It appears the Pink Squirrel was quite popular in its day and has risen and fallen in popularity in the decades since. I couldn’t find a definitive academic source, but if one trusts Wikipedia, the Pink Squirrel was first created at Bryant’s Cocktail Lounge in Milwaukee, Wisconsin. The establishment still exists, and its website states that the original bartender, Bryant Sharp, is credited with inventing the Pink Squirrel (also the Blue Tail Fly and the Banshee, if you’re interested in cocktails). Wikipedia lists 15 popular culture references for the drink, many from 90s sitcoms (I’m a child of the 80s but don’t remember this) along with more current mentions. I also found an online source saying it was popular on the New York cocktail scene in the late 70s and early 80s (maybe?). Our Duke catalog returns some results as well, including articles from Saveur (2014), New York Times Magazine (2006), Restaurant Hospitality (1990), and Cosmopolitan (1981). These are mostly variations on the recipe, including cocktails made with cream, a cocktail made with ice cream (Saveur says “blender drinks” are a cherished tradition in Wisconsin), a pie(!), and a cheesecake(!!).

Armed with recipes for the cream-based and the ice cream-based cocktails, I figured I was all set to shop for ingredients and make the drinks. However, I quickly discovered that one of the three ingredients, crème de noyaux, is a liqueur that few companies make in large quantities anymore, and it proved impossible to find around the Triangle. It’s an important ingredient in this drink, though, not only for its nutty flavor, but also because it’s what gives the drink its pink hue (and obviously its name!). Determined to make this work, I decided to search for a good enough alternative. I started with the Duke catalog, as all good library folk do, but with very little luck there, I turned back to Google. This led me to another Wikipedia article, for crème de noyaux, which suggested substituting Amaretto and some red food coloring. It also directed me to an interesting blog about none other than crème de noyaux, the Pink Squirrel, Bryant’s Cocktail Lounge, and a recipe from 1910 for making crème de noyaux at home. However, with time against me, I chose to sub Amaretto and red food coloring instead of making the 1910 homemade version.

First up was the cream-based cocktail. The drink contains 1.5 ounces of heavy cream, 0.75 ounces of white crème de cacao, and 0.75 ounces of crème de noyaux (or Amaretto with a drop of red food coloring), and is served up in a martini glass.

The result was a creamy, chocolatey flavor with a slight nuttiness, and just enough sweetness without being overbearing. The ice cream version replaces the heavy cream with half a cup of vanilla ice cream and is blended rather than shaken. It had a thicker consistency and was much sweeter. My fellow taster and I definitely preferred the cream version. In fact, don’t be surprised if you see me around with a pink martini in hand sometime in the near future.

On TRAC: Assessment Tools and Trustworthiness

Duke Digital Repository is, among other things, a digital preservation platform and the locus of much of our work in that area.  As such, we often ponder the big questions:

  1. What is the repository?
  2. What is digital preservation?
  3. How are we doing?


What is the repository?

Fortunately, Ginny gave us a good start on defining the repository in Revisiting: What is the Repository?  It’s software, hardware, and collaboration.  It’s processes, policies, attention, and intention.  While digital preservation is one of the focuses of the repository, digital preservation extends beyond the repository and should far outlive the repository.

What is digital preservation?

There are scores of definitions, but this Medium Definition from ALCTS is representative:

Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.

This is the short answer to the question: Accurate rendering of authenticated digital content over time.  This is the motivation behind the work described in Preservation Architecture: Phase 2 – Moving Forward with Duke Digital Repository.

How are we doing?

There are two basic methodologies for assessing this work: reactive and proactive.  A reactive approach to digital preservation might be characterized by “Hey!  We haven’t lost anything yet!”, which is why we like the proactive approach.
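To make “proactive” a little more concrete, the canonical example is a scheduled fixity check: recompute checksums for stored files and compare them against the values recorded at ingest.  The sketch below is illustrative only and assumes a simple path,sha256 manifest file; it is not a description of our actual preservation tooling.

    import csv
    import hashlib
    from pathlib import Path

    def sha256(path, chunk_size=1024 * 1024):
        """Stream a file through SHA-256 so large files never load fully into memory."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def audit(manifest_csv):
        """Yield (path, status) for every file listed in a 'path,sha256' manifest."""
        with open(manifest_csv, newline="") as f:
            for row in csv.DictReader(f):
                path = Path(row["path"])
                if not path.exists():
                    yield row["path"], "MISSING"
                elif sha256(path) != row["sha256"]:
                    yield row["path"], "CHECKSUM MISMATCH"
                else:
                    yield row["path"], "OK"

    if __name__ == "__main__":
        # Hypothetical manifest name; a real audit would also log results somewhere durable.
        for path, status in audit("preservation_manifest.csv"):
            print(f"{status:18} {path}")

Run on a schedule, a report like this is the difference between hoping the bits are intact and being able to demonstrate it.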

Digital preservation can be a pretty deep rabbit hole, and it can be an expensive proposition to attempt to mitigate the long tail of risk.  Fortunately, the community of practice has developed tools to assist in the planning and execution of trustworthy repositories.  At Duke, we’ve got several years’ experience working in the framework of the Center for Research Libraries’ Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC) as the primary assessment tool by which we measure our efforts.  Much of the work to document our preservation environment and the supporting institutional commitment was focused on our DSpace repository, DukeSpace.  A great deal has changed in the past three years, including significant growth in our team and scope.  So, once again we’re working to measure ourselves against the standards of our profession and to use that process to inform our work.

There are three areas of focus in TRAC: Organizational Infrastructure, Digital Object Management, and Technologies, Technical Infrastructure, & Security.  These cover a very wide and deep field and include things like:

  • Securing Service Level Agreements for all service providers
  • Documenting the organizational commitments of both Duke University and Duke University Libraries and sustainability plans relating to the repository
  • Creating and implementing routine testing of backup, remote replication, and restoration of data and relevant infrastructure
  • Creating and approving documentation on a wide variety of subjects for internal and external audiences


Back to the question: How are we doing?

Well, we’re making progress!  Naturally we’re starting with ensuring the basic needs are met first: successfully preserving the bits, maximizing transparency and external validation that we’re not losing the bits, and working on a sustainable, scalable architecture.  We have a lot of work ahead of us, of course.  The boxes in the illustration are all the same size, but the work they represent is not.  For example, the Disaster Recovery Plan at HathiTrust is 61 pages of highly detailed thoughtfulness.  These efforts build on each other, though, so we’re confident that the work we’re doing on the supporting bodies of policy, procedure, and documentation will ease the way toward a complete Disaster Recovery Plan.

The ABCs of Digitizing Section A

I’m not sure anyone who currently works in the library has any idea when the phrase “Section A” was first coined as a call number for small manuscript collections. Before the library’s renovation, before we barcoded all our books and boxes — back when the Rubenstein was still RBMSCL, and our reading room carpet was a very bright blue — there was a range of boxes holding single-folder manuscript collections, arranged alphabetically by collection creator. And this range was called Section A.

Box 175 of Section A

Presumably there used to be a Section B, Section C, and so on — and it could be that the old shelf ranges were tracked this way, I’m not sure — but the only one that has persisted through all our subsequent stacks moves and barcoding projects has been Section A. Today there are about 3900 small collections held in 175 boxes that make up the Section A call number. We continue to add new single-folder collections to this call number, although thanks to the miracle of barcodes in the catalog, we no longer have to shift files to keep things in perfect alphabetical order. The collections themselves have no relationship to one another except that they are all small. Each collection has a distinct provenance, and the range of topics and time periods is enormous — we have everything from the 17th to the 21st century filed in Section A boxes. Small manuscript collections can also contain a variety of formats: correspondence, writings, receipts, diaries or other volumes, accounts, some photographs, drawings, printed ephemera, and so on. The bang-for-your-buck ratio is pretty high in Section A: though small, the collections tend to be well-described, meaning that there are regular reproduction and reference requests. Section A is used so often that in 2016, Rubenstein Research Services staff approached Digital Collections to propose a mass digitization project, re-purposing the existing catalog description into digital collections within our repository. This will allow remote researchers to browse all the collections easily, and also reduce repetitive reproduction requests.

This project has been met with enthusiasm and trepidation from staff since last summer, when we began to develop a cross-departmental plan to appraise, enhance the description of, and digitize the 3900 small manuscript collections that are housed in Section A. It took us a bit of time, partially due to the migration and other pressing IT priorities, but this month we are celebrating a major milestone: we have finally launched our first two Section A collections, meant to serve as a proof of concept as well as a chance for us to firmly define the project’s goals and scope. Check them out: Abolitionist Speech, approximately 1850, and the A. Brouseau and Co. Records, 1864-1866. (Appropriately, we started by digitizing collections that begin with the letter A.)

A. Brouseau & Co. Records carpet receipts, 1865

Why has it been so complicated? First, the sheer number of collections is daunting; while there are plenty of digital collections with huge item counts already in the repository, they tend to come from a single or a few archival collections. Each newly-digitized Section A collection will be a new collection in the repository, which has significant workflow repercussions for the Digital Collections team. There is no unifying thread for Section A collections, so we are not able to apply metadata in batch like we would normally do for outdoor advertising or women’s diaries. Rubenstein Research Services and Library Conservation Department staff have been going box by box through the collections (there are about 25 collections per box) to identify out-of-scope collections (typically reference material, not primary sources), preservation concerns, and copyright concerns. These are excluded from the digitization process. Technical Services staff are also reviewing and editing the Section A collections’ description. This project has led to our enhancing some of our oldest catalog records — updating titles, adding subject or name access, and upgrading the records to RDA, a relatively new standard. Using scripts and batch processes (details on GitHub), the refreshed MARC records are converted to EAD files for each collection, and the digitized folder is linked through ArchivesSpace, our collection management system. We crosswalk the catalog’s name and subject access data to both the finding aid and the repository’s metadata fields, allowing the collection to be discoverable through the Rubenstein finding aid portal, the Duke Libraries catalog, and the Duke Digital Repository.
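To give a flavor of the crosswalk work described above, here is a deliberately simplified sketch in Python using the pymarc library. It is not the production code (those scripts are on GitHub, linked above, and handle a much richer field mapping plus the ArchivesSpace linking step); the handful of fields, the file names, and the bare-bones EAD skeleton here are illustrative assumptions only.

    import xml.etree.ElementTree as ET
    from pymarc import MARCReader  # pip install pymarc

    def marc_to_ead(record):
        """Build a minimal EAD <ead> element from a single catalog record."""
        ead = ET.Element("ead")
        archdesc = ET.SubElement(ead, "archdesc", level="collection")
        did = ET.SubElement(archdesc, "did")

        titles = record.get_fields("245")
        if titles:
            ET.SubElement(did, "unittitle").text = " ".join(titles[0].get_subfields("a", "b"))

        # Crosswalk a few subject/name access fields into <controlaccess>.
        access = ET.SubElement(archdesc, "controlaccess")
        for field in record.get_fields("600", "610", "650"):
            ET.SubElement(access, "subject").text = " -- ".join(field.get_subfields("a", "x", "z"))

        return ead

    if __name__ == "__main__":
        # Hypothetical input file of refreshed Section A MARC records.
        with open("section_a_records.mrc", "rb") as fh:
            for i, record in enumerate(MARCReader(fh)):
                if record is None:
                    continue  # skip unreadable records in this sketch
                ET.ElementTree(marc_to_ead(record)).write(
                    f"ead_{i:04}.xml", encoding="utf-8", xml_declaration=True
                )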

It has been really exciting to see the first two collections go live, and there are many more already digitized and just waiting in the wings for us to automate some of our linking and publishing processes. Another future development that we expect will speed up the project is a batch ingest feature for collections entering the repository. With over 3000 collections to ingest, we are eager to streamline our processes and make things as efficient as possible. Stay tuned here for more updates on the Section A project, and keep an eye on Digital Collections if you’d like to explore some of these newly-digitized collections.

The Research Data Team: Hitting the Ground Running

There has been a lot of blogging over the last year about the Duke Digital Repository’s development and implementation, about its growth as a platform and a program, and about the creation of new positions to support research data management and curation. My fellow digital content analyst also recently posted about how we four new hires have been creating and refining our research data curation workflow since beginning our positions at Duke this past January. It’s obviously been (and continues to be) a very busy time here for the repository team at Duke Libraries, seasoned and new staff alike.

Besides the research data workflows between our two departments, what other things have the data management consultants and the digital content analysts been doing? In short, we’ve been busy!


In addition to envisioning stakeholder needs (an exercise we do continuously), we’ve received and ingested several data collections this year, which has given us an opportunity to also learn from experience. We have been tracking and documenting the types of data we’re receiving, the various needs that these types of data and depositors have, how we approach these needs (including investigating and implementing any additional tools that may help us better address them), how our repository displays the data and associated metadata, and the time spent on our management and curation tasks. Some of these are in the form of spreadsheets, others are draft policies that will first be reviewed by the library’s research data working group and then by a program committee, and others are simply brain dumps for things that require further, more structured investigation by developers, the metadata architect, subject librarians, and other stakeholders. These documents live in either our shared online folder or our shared Box account, and, if a wider Duke Libraries and/or public audience is required, are moved to our departments’ content collaboration software platforms (currently Confluence/Jira and Basecamp). The collaborative environments of these platforms support the dynamic nature of our work, particularly as our program takes form.

We also value the importance of face-to-face discussions, so we hold weekly meetings to talk through all of this work (we prefer to meet outside when the weather is nice, and because squirrels are awesome).

One of the most exciting, and at times challenging, aspects of where we are is that we are essentially starting from the ground up and therefore able to develop procedures and features (and re-develop, and on and on again) until we find fits that best accommodate our users and their data. We rely heavily on each other’s knowledge about the research data field, and we also engage in periodic environmental scans of other institutions that offer data management and curation services.

When we began in January, we all considered the first 6-9 months a “pilot phase”, though this description may not be accurate. In the minds of the data management consultants and the digital content analysts, we’re here and ready. Will we run into situations that require an adjustment to our procedures? Absolutely. It’s the nature of our work. Do we want feedback from the Duke community about how our services are (or are not) meeting their needs? Without a doubt. And will the DDR team continue to identify and implement features to better meet end-user needs? Certainly. We fully expect to adjust and readjust our tools and services, with the overall goal of fulfilling future needs before they’re even evident to our users. So, as always, keep watching to see how we grow!

A Tough Nut to Crack: Developing Digital Repositories

Folks, developing digital repositories is hard.  There are so many different layers of complexity built into the stack, compounded by the unique variety of end-users, or stakeholders, that we serve.

Consider the breadth of this work:

Starting at the bottom of the stack, you have our Preservation layer.  This is where we capture your bits, and ensure the long-term preservation of your digital assets.  But it goes well beyond just logging a single record in a database.  It involves capturing the data stream, writing that file and all associated files (metadata) to storage, replicating the data to various geographically dispersed servers, validating the ingest, logging the validation, ensuring successful recovery of replicated assets, and more.

All of that comes post-ingest.  I’ll not even belabor the complexities of data modeling and ingest here, but you get the idea… it’s hairy stuff.  Receiving and massaging a highly diverse body of data into a package appropriate for homogeneous ingest is a monumental effort in normalization.
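As a rough illustration of what that normalization means in practice (and only an illustration; the directory layout, manifest format, and field names below are assumptions, not our actual ingest packaging), the goal is to turn an arbitrary pile of submitted files into a predictable package: a uniform layout, checksums recorded at ingest, and a small metadata sidecar.

    import hashlib
    import json
    import shutil
    from pathlib import Path

    def build_package(submission_dir, package_dir, metadata):
        """Copy a heterogeneous submission into <package>/data/ and record checksums."""
        submission_dir = Path(submission_dir)
        data_dir = Path(package_dir) / "data"
        data_dir.mkdir(parents=True, exist_ok=True)

        manifest = {}
        for src in sorted(submission_dir.rglob("*")):
            if src.is_file():
                rel = src.relative_to(submission_dir)
                dest = data_dir / rel
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dest)
                # read_bytes() is fine for a sketch; stream large files in practice.
                manifest[str(rel)] = hashlib.sha256(dest.read_bytes()).hexdigest()

        (Path(package_dir) / "manifest-sha256.json").write_text(json.dumps(manifest, indent=2))
        (Path(package_dir) / "metadata.json").write_text(json.dumps(metadata, indent=2))

    if __name__ == "__main__":
        build_package(
            "incoming/researcher_dropoff",   # hypothetical submission directory
            "packages/ddr-000123",           # hypothetical package identifier
            {"title": "Example data set", "depositor": "Example Lab"},
        )

The checksums recorded at this stage are what later fixity audits, like the one sketched earlier, verify against.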

Move up the stack into our Curation layer.  Currently we have a single administrative application that facilitates management and curatorial activities for our digital objects following ingest.  Roles and access controls can be managed here, in addition to various types of metadata (descriptions of the item), etc.  There are a variety of other applications managed at this layer, which interact with, and store, various values that fuel display and functionality within the user interface.  This layer is quickly evolving in a way that necessitates diversification.  We have found that a single monolithic application is not a one-size-fits-all solution for our stakeholders who are in the business of data production/curation; it is at this layer that we are getting increasing pressure to integrate and inter-operate with a myriad of other tools and platforms for resource/data management.  This is tricky business, as each of these tools handles data in different ways.

Finally, we have the Discovery layer.  The user interface.  This is what the public sees and consumes.  It’s where access to ingested materials occurs.  It is itself an application requiring significant custom development to meet the needs of various programs and collections of materials.  It is tightly coupled with the Curation layer, and therefore highly complex and customized to meet the needs of different focal areas.  Search functionality is yet another piece of complexity that requires maintenance and customization of a central index.  Nothing is OOTB (out of the box).  Everything requires configuration and customization.

And ALL of this, all of it, is inter-related.  Highly coupled and complex.  Few things yield easy wins, and often our work challenges foundational assumptions that have come well before.  It’s an exercise in balancing technical debt and moving forward without re-inventing the wheel every six months.

What I have presented here is a simplistic view of our software eco-system.  It’s just a snapshot of the various puzzle pieces that support the operation of a production repository.  In general, digital repositories are still fairly new on the scene.  No one has them figured out entirely and everyone does them a little bit differently.  There’s a strength to that which manifests in diverse platforms and a breadth of development possibilities.  There’s a weakness to it because there is no cookie-cutter approach that defines an easy path forward.

So it’s an exercise in evolution.  In iteration.  In patience.  In requirements definition.  We’re not always going to get it right, and our efforts will take a bit of time and experimentation, but we’re constantly working to improve, to enhance, and to mature our repository platform to meet the growing and evolving needs of our University.

So, here’s to many years of hard work ahead!  And many successful collaborations with our Duke community to realize our repository’s future.  We’re ready if you are!

An Update on TRLN Discovery: A New Catalog for the Triangle Research Libraries Network.

We are still many months (well, OK — a year or more) away from unplugging the servers that keep the Endeca-powered Search TRLN catalog and our local catalogs alive. But in the meantime, work is well underway on its replacement. For those who don’t know, the libraries of Duke, UNC Chapel Hill, NC State, and NCCU have for many years maintained a shared catalog to make resources from all the Triangle research libraries easily available to our communities. We all push our records to a centralized data pipeline that indexes our data in Endeca, and we even share much of the code that runs our local catalog user interfaces. While the browsing and searching capabilities of Endeca were innovative for library catalogs at the time, the rest of the library world has caught up, and there are now numerous open source solutions for providing convenient and powerful search and browse access to our holdings.

Some goals for the replacement of the Endeca-based shared catalog system include:

  • Maintain functionality of the current catalog, including search & browse features.
  • Architect the new system so that it’s easier for staff to troubleshoot problems and test changes to the data pipeline.
  • Use open source tools already in common use by peer libraries to take advantage of community effort and knowledge.
  • Develop a platform that makes it easy for each institution to host its own copy of the catalog UI, one that takes advantage of shared features and code while allowing for local customization where needed.
  • Provide a centralized data service and index that can process and host the 12 million or so records from the member institutions and serve as the back-end index for all local catalogs. (NCCU will use a separate vended solution for their catalog.)

The large team of librarians and staff from across the Research Triangle libraries has been hard at work planning for the new system, and many characteristics of the new system are becoming clear:

  • We will use Solr as the replacement for Endeca. Solr will store and index our collective holdings and provide search and browse access to them for our catalog applications.
  • Blacklight, a UI built to search and browse a Solr index, will provide the basis for our local catalogs. Blacklight is used by Stanford, Cornell, and many other libraries to provide the user interface to their catalogs.
  • We will build a Rails Engine that will add to Blacklight any customizations needed to support the consortial catalog and any additional features that the institutions want to share. This engine will make it easy to build a new Blacklight-based catalog that will work with the TRLN Solr Index.
  • The data pipeline, Solr index and Catalog UI will be packaged in such a way that staff can run a near copy of the production system locally to develop and test changes.

What we have so far:

  • Scripts built with Traject that will take our MARC XML or MARC Binary records and transform them into an intermediate JSON format.
  • Scripts that take the JSON data and index it in Solr (a sketch of both steps appears after this list).
  • The beginnings of a Rails engine that will form the basis of our Blacklight catalog user interfaces.
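For illustration, here is a minimal Python sketch of the two scripted steps above: flatten MARC into intermediate JSON documents, then push them into Solr. The real pipeline uses Traject (a Ruby tool); the pymarc and pysolr libraries, the Solr core URL, and the field names below are my own assumptions for the sketch.

    import json
    import pysolr                    # pip install pysolr
    from pymarc import MARCReader    # pip install pymarc

    def marc_to_doc(record):
        """Flatten one MARC record into an intermediate JSON-style dictionary."""
        ids = record.get_fields("001")
        titles = record.get_fields("245")
        return {
            "id": ids[0].data if ids else None,
            "title_main": " ".join(titles[0].get_subfields("a", "b")) if titles else None,
            "subject_headings": [" -- ".join(f.get_subfields("a", "x", "z"))
                                 for f in record.get_fields("650")],
        }

    def index(marc_path, solr_url, out_json="intermediate.json"):
        with open(marc_path, "rb") as fh:
            docs = [marc_to_doc(r) for r in MARCReader(fh) if r]

        # Persist the intermediate JSON so it can be inspected, diffed, and re-indexed.
        with open(out_json, "w") as out:
            json.dump(docs, out, indent=2)

        pysolr.Solr(solr_url, always_commit=True).add(docs)

    if __name__ == "__main__":
        index("catalog_sample.mrc", "http://localhost:8983/solr/trln-catalog")

Keeping the intermediate JSON on disk, as the list above describes, is what lets staff inspect and test the data pipeline without re-reading the MARC every time.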

The technical implementation team is using the TRLN GitHub account to collaborate on development, and much work-in-progress code is posted there. There is still plenty of work left to do, a long list of requirements to accommodate and many unanswered questions, but the project is well underway to build the next shared catalog for the Triangle Research Libraries.

508 Update, Update

A little more than a year ago, I wrote about the proposed update to the 508 accessibility standards. And about three weeks ago, the US Access Board published the final rule that contains updates to the 508 accessibility requirements for Information and Communication Technology (ICT). The rules had not been updated since 2001 and as such had greatly lagged behind modern web conventions.

It’s important to note that the 508 guidelines are intended to serve as a vehicle for guiding procurement, while at the same time applying to content created by a given group/agency. As such, the language isn’t always straightforward.

What’s new?

As I outlined in my previous post, a major purpose of the new rule is to move away from regulating types of devices and instead focus on functionality:


… one of the primary purposes of the final rule is to replace the current product-based approach with requirements based on functionality, and, thereby, ensure that accessibility for people with disabilities keeps pace with advances in ICT.


To that effect, one of the biggest changes from the old standard is the adoption of WCAG 2.0 as the compliance level. The fundamental premise of WCAG compliance is that content is ‘perceivable, operable, and understandable’ — the bottom line is that as developers, we should strive to make sure all of our content is usable for everyone across all devices. The adoption of WCAG allows the board to offload responsibility for making incremental changes as technology advances (so we don’t have to wait another 15 years for updates) and also aligns our standards in the United States with those used around the world.


Harmonization with international standards and guidelines creates a larger marketplace for accessibility solutions, thereby attracting more offerings and increasing the likelihood of commercial availability of accessible ICT options.


Another change has to do with making a wider variety of electronic content accessible, including internal documents. It will be interesting to see to what degree this part of the rule is followed by non-federal agencies.


The Revised 508 Standards specify that all types of public-facing content, as well as nine categories of non-public-facing content that communicate agency official business, have to be accessible, with “content” encompassing all forms of electronic information and data. The existing standards require Federal agencies to make electronic information and data accessible, but do not delineate clearly the scope of covered information and data. As a result, document accessibility has been inconsistent across Federal agencies. By focusing on public-facing content and certain types of agency official communications that are not public facing, the revised requirements bring needed clarity to the scope of electronic content covered by the 508 Standards and, thereby, help Federal agencies make electronic content accessible more consistently.


The new rules do not go into effect until January 2018. There’s also a ‘safe harbor’ clause that protects content that was created before this enforcement date, assuming it was in compliance with the old rules. However, if you update that content after January, you’ll need to make sure it complies with the new final rule.


Existing ICT, including content, that meets the original 508 Standards does not have to be upgraded to meet the refreshed standards unless it is altered. This “safe harbor” clause (E202.2) applies to any component or portion of ICT that complies with the existing 508 Standards and is not altered. Any component or portion of existing, compliant ICT that is altered after the compliance date (January 18, 2018) must conform to the updated 508 Standards.


So long story short, a year from now you should make sure all the content you’re creating meets the new compliance level.

Revisiting: What is the Repository?

Here at the Duke University Libraries we recently hosted a series of workshops that were part of a larger Research Symposium on campus.  It was an opportunity for various campus agencies to talk about all of the evolving and innovative ways that they are planning for and accommodating research data.  A few of my colleagues and I were asked to present on the new Research Data program that we’re rolling out in collaboration with the Duke Digital Repository, and we were happy to oblige!

I was asked to speak directly about the various software development initiatives that we have underway with the Duke Digital Repository.  Since we’re in the midst of rolling out a brand new program area, we’ve got a lot of things cooking!

When I started planning for the conversation, I initially thought I would talk a lot about our Fedora/Hydra stack and the various inter-related systems that we’re planning to integrate into our repository eco-system.  But what resulted from that was a lot of technical terms and open-source software project names that didn’t mean a whole lot to anyone, especially those not embedded in the work.  As a result, I took a step back and decided to focus at a higher level.  I wanted to present to our faculty that we were implementing a series of software solutions that would meet their needs for accommodation of their data.  This had me revisiting the age-old question: What is our Repository?  And for the purposes of this conversation, it boiled down to this:

And this:

It is a highly complex, often mind-boggling set of software components, that are wrangled and tamed by a highly talented team with a diversity of skills and experience, all for the purposes of supporting Preservation, Curation, and Access of digital materials.

Those are our tenets, or objectives.  They are the principles that guide our work.  Let’s dig in a bit on each.

Our first objective is Preservation.  We want our researchers to feel 100% confident that when they give us their data, we are preserving the integrity, longevity, and persistence of that data.

Our second objective is to support Curation.  We aim to do that by providing software solutions that facilitate management and description of file sets and logical arrangement of complex data sets.  This piece is critically important because the data cannot be put to full use without solid description and modeling that convey its purpose and intended use and facilitate discovery of the materials.

Finally, our work, our software, aims to facilitate discovery & access.  We do this by architecting thoughtful solutions that optimize metadata and modeling; we build out features that enhance the consumption and usability of different format types; and we tweak, refine, and optimize our code to enhance performance and user experience.

The repository is a complex beast.  It’s a software stack and an eco-system of components.  It’s Fedora.  It’s Hydra.  It’s a whole lot of other project names that are equally attractive and mystifying.  At its core, though, it’s a software initiative, one that seeks to serve up an eco-system of components with optimal functionality that meets the needs and desires of our programmatic stakeholders: our University.

Preservation, Curation, & Access are the heart of it.