All posts by Sean Aery

Highlighting Duke Scholars Alongside Their Research

2018 has featured several monumental changes in the library’s technology platforms. One of the most impactful shifts this year was revitalizing DukeSpace, our DSpace-based institutional repository (IR) software, home to over 16,000 open-access articles, theses, and dissertations from Duke scholars.

Back in March, we celebrated a successful multi-version upgrade for DSpace, and along with it, a major upgrade to the integral Symplectic Elements Research Information Management platform.  On the heels of that project, we decided to capitalize on the project team’s momentum and invest two more months of focused attention (four “sprints” in developer-speak).

The goals for that period? First, tie up the loose ends from the upgrade. Then, seize some clear opportunities to build upon our freshly-rearchitected metadata, creating innovative features in the UI to benefit scholars. By scholars, we mean — in part — the global audience of researchers openly discovering and using the articles in DukeSpace. But we especially mean the scholars at Duke who created them in the first place.

We are excited to share the results of our work with the Duke community and beyond. Here are the noteworthy additions:

Scholars@Duke Author Profiles

Item pages now display a brief embedded profile for each Duke author, featuring their preferred name, a photo, position title, and brief description of their research interests. This information comes from the scholars themselves, who manage their profiles via Scholars@Duke (powered by the open-source VIVO platform).

Scholars@Duke & DukeSpace
A brief author profile in a DukeSpace item  leads to the full profile in Scholars@Duke.

Scholars@Duke provides a handy SEO-friendly profile page (example) for each scholar. It aggregates their full list of publications, courses taught, news articles in which they’re mentioned, and much more. It also has useful APIs for building widgets to dynamically repurpose the data for display elsewhere. Departments throughout Duke (example) use these APIs to display current faculty information on their web sites without requiring anyone to manually duplicate or update it. And now, the library does, too.

Scholars@Duke profile in DukeSpace
Scholars’ own descriptions of their research interests entice a visitor to learn more about their work.

Featuring researchers in this manner adds several other benefits, including:

  • Uses a scholar’s own preferred current version of their name and title; that may not be identical to what is stored in the item’s author metadata.
  • Puts users one click away from the author’s full profile at Scholars@Duke, where one can discover the entirety of the author’s publications, open-access or not.
  • Helps search engines make stronger semantic connections between an author’s profile information and their works available online.
  • Introduces a unique value-add feature for our open-access copy of an article that’s unlikely to ever be possible to replicate for the published version on the academic journal’s website.
  • Makes the DukeSpace item pages look better, warmer, and more inviting.
Scholars@Duke multiple author profiles in DukeSpace
Profiles for multi-author articles

With this feature, we are truly pushing beyond the boundaries of what an institutional repository traditionally does. And likewise, we feel we’re raising the bar for how academic research libraries can showcase the members of their communities alongside their collected works.

Other New Features

Beyond these new author profiles, we managed to fit in a few more enhancements around citations, the homepage, and site navigation. Here’s a quick rundown:

Citations

We now present an easily copyable citation, composed from the various metadata available for each item. This includes the item’s permalink.

DukeSpace citation for a dissertation
DukeSpace citation for a dissertation

In cases when there’s a published version of an article available, we direct users to review and use that citation instead.

DukeSpace citation with DOI
DukeSpace article with a DOI for a published version.

 

Item pages also now display a “Citation Stats” badge in the sidebar, powered by Digital Science’s Dimensions tool.  Click it to explore other scholarly work that has cited the current item.

Homepage

Finally, we topped off this project phase by redesigning DukeSpace’s homepage. Notable among the changes: a clearer indication of the number of items (broken down by type), a dynamic list of trending items, and streamlined menu navigation in the sidebar.

 

Final Thoughts

Duke Libraries’ current strategic plan emphasizes the mission-critical importance of our open-access publishing and repository efforts, and also demands that we “highlight and promote the scholarly activities of our faculty, students, and staff.”  This two-month DukeSpace enhancements project was a great opportunity for us to think outside the box about our technology platforms, and consider how those goals relate.


Many thanks to several people whose work enabled these features to come to life, especially Maggie Dickson, Hugh Cayless, Paolo Mangiafico, and the Scholars@Duke team.

Revitalizing DSpace at Duke

Near the tail end of 2017, the Duke Libraries committed to a major multi-version upgrade for DukeSpace (powered by the open-source repository platform DSpace), and assembled an Avengers-like team to combine its members’ complementary powers to conquer it together.  The team persisted through several setbacks and ultimately prevailed in its mission. The new site launched successfully in March 2018.

That same team is now back for a sequel, collaborating to tackle additional issues around system integrations, statistics/reporting, citations, and platform maintenance. Phase II of the project will wrap up this summer.

I’d like to share a bit more about the DSpace upgrade project, beginning with some background on why it’s important and where the platform fits into the larger picture at Duke. Then I’ll share more about the areas to which we have devoted the most developer time and attention over the past several months.   Some of the development efforts were required to make DSpace 6 viable at all for Duke’s ongoing needs. Other efforts have been to strengthen connections between DukeSpace and other platforms.  We have also been enhancing several parts of the user interface to optimize its usability and visual appeal.

DSpace at Duke: What’s in It?

Duke began using DSpace around 2006 as a solution for Duke University Archives to collect and preserve electronic theses and dissertations (ETDs). In 2010, the university adopted an Open Access policy for articles authored by Duke faculty, and DukeSpace became the host platform to make these articles accessible under the policy. These two groups of materials represent the vast majority of the 15,000+ items currently in the platform. Ensuring long-term preservation, discovery, and access to these items is central to the library’s mission.

Integrations With Other Systems

DukeSpace is one of three key technology platforms working in concert to support scholarly communications at Duke. The other two are the proprietary Research Information Management System Symplectic Elements, and the open-source research networking tool VIVO (branded as Scholars@Duke). Here’s a diagram illustrating how the platforms work together, created by my colleague Paolo Mangiafico:

Credit: Paolo Mangiafico

 

In a nutshell, DSpace plays a critical role in Duke University scholars’ ability to have their research easily discovered, accessed, and used.

  • Faculty use Elements to manage information about their scholarly publications. That information is pulled neatly into Scholars@Duke which presents for each scholar an authoritative profile that also includes contact info, courses taught, news stories in which they’re mentioned,  and more.
  • The Scholars@Duke profile has an SEO-friendly URL, and the data from it is portable: it can be dynamically displayed anywhere else on the web (e.g., departmental websites).
  • Elements is also the place where faculty submit the open access copies of their articles; Elements in turn deposits those files and their metadata to DSpace. Faculty don’t encounter DSpace at all in the process of submitting their work.
  • Publications listed in a Scholars@Duke profile automatically include a link to the published version (which is often behind a paywall), and a link to the open access copy in DSpace (which is globally accessible).

Upgrading DSpace: Ripple Effects

The following diagram expands upon the previous one. It adds boxes to the right to account for ETDs and other materials deposited to DSpace either by batch import mechanisms or directly via the application’s web input forms. In a vacuum, a DSpace upgrade–complex as that is in its own right–would be just the green box. But as part of an array of systems working together, the upgrade meant ripping out and replacing so much more. Each white star on the diagram represents a component that had to be thoroughly investigated and completely re-done for this upgrade to succeed.

One of the most complicated factors in the upgrade effort was the bidirectional arrow marked “RT2”:  Symplectic’s new Repository Tools 2 connector. Like its predecessor RT1, it facilitates the deposit of files and metadata from Elements into DSpace (but now via different mechanisms). Unlike RT1, RT2 also permits harvesting files and metadata from DSpace back into Elements, even for items that weren’t originally deposited via Elements.  The biggest challenges there:

  • Divergent metadata architecture. DukeSpace and Elements employ over 60 metadata fields apiece (and they are not the same).
  • Crosswalks. The syntax for munging/mapping data elements from Elements to DSpace (and vice versa) is esoteric, new, and a moving target.
  • Legacy/inconsistent data. DukeSpace metadata had not previously been analyzed or curated in the 12 years it had been collected.
  • Newness. Duke is likely the first institution to integrate DSpace 6.x & Elements via RT2, so a lot had to be figured out through trial & error.

Kudos to superhero metadata architect Maggie Dickson for tackling all of these challenges head-on.

User Interface Enhancements in Action

There are over 2,000 DSpace instances in the world. Most implementors haven’t done much to customize the out-of-the-box templates, which look something like this for an item page:

DSpace interface out of the box. From http://demo.dspace.org/xmlui/

The UI framework itself is outdated (driven via XSLT 1.0 through Cocoon XML pipelines), which makes it hard for anyone to revise substantially. It’s a bit like trying to whittle a block of wood into something ornate using a really blunt instrument. The DSpace community is indeed working on addressing that for DSpace 7.0, but we didn’t have the luxury to wait. So we started with the vanilla template and chipped away at it, one piece at a time. These screenshots highlight the main areas we have been able to address so far.

Bootstrap / Bootswatch Theme

We layered on the same adapted Bootswatch theme in use by the Duke Libraries’ Drupal website and Duke Digital Repository, then applied the shared library masthead. This gives DukeSpace a fairly common look and feel with the rest of the library’s web presence.

Images, Icons, and Filesizes

We configured DSpace to generate and display thumbnail images for all items. Then we added icons corresponding to MIME types to help distinguish different kinds of files. We added really prominent indicators for when an item was embargoed (and when it would become available), and also revised the filesize display to be more clear and concise.

Usage & Attention Stats

Out of the box, DSpace item statistics are only available by clicking a link on the item page to go to a separate stats page. We figured out how to tap into the Solr statistics core and transform that data to display item views and file downloads directly in the item sidebar for easier access. We were also successful showing an Altmetric donut badge for any article with a DOI. These features together help provide a clear indication on the item page how much of an impact a work has made.

Rights

We added a lookup from the item page to retrieve the parent collection’s rights statement, which may contain a statement about Open Access, a Creative Commons license, or other explanatory text. This will hopefully assert rights information in a more natural spot for a user to see it, while at the same time draw more attention to Duke’s Open Access policy.

Scholars@Duke Profiles & ORCID Links

For any DukeSpace item author with a Scholars@Duke profile, we now display a clickable icon next to their name. This leads to their Scholars@Duke profile, where a visitor can learn much more about the scholar’s background, affiliations, and other research. Making this connection relies on some complicated parts: 1) enable getting Duke IDs automatically from Elements or manually via direct entry; 2) storing the ID in a DSpace field; 3) using the ID to query a VIVO API to retrieve the Scholars@Duke profile URL. We are able to treat a scholar’s ORCID in a similar fashion.

Other Development Areas

Beyond the public-facing UI, these areas in DSpace 6.2 also needed significant development for the upgrade project to succeed:

  • Fixed several bugs related to batch metadata import/export
  • Developed a mechanism to create user accounts via batch operations
  • Modified features related to authority control for metadata values

Coming Soon

By summer 2018, we aim to have the following in place:

Streamlined Sidebar

Add collapsable / expandable facet and browse options to reduce the number of menu links visible at any given time.

Citations

Present a copyable citation on the item page.


…And More!

  • Upgrade the XSLT processor from Xalan to Saxon, using XLST 3.0; this will enable us to accomplish more with less code going forward
  • Revise the Scholars@Duke profile lookup by using a different VIVO API
  • Create additional browse/facet options
  • Display aggregated stats in more places

We’re excited to get all of these changes in place soon. And we look forward to learning more from our users, our collaborators, and our peers in the DSpace community about what we can do next to improve upon the solid foundation we established during the project’s initial phases.

A New & Improved Rubenstein Library Website

We kicked off the spring 2018 semester by rolling out a brand-new design for the David M. Rubenstein Library website.  The new site features updated imagery from the collections, better navigation, and more prominent presence for the exhibits currently on display.

Rubenstein website, Jan 2018
January 2018: New design for the Rubenstein Library homepage.

 

Much credit goes to Katie Henningsen and Kate Collins who championed the project.

Objectives for the Redesign

  • Make wayfinding from the homepage clearer (by reorganizing links into a primary dropdown navigation)
  • Dynamically feature Rubenstein Library exhibits that are currently on display
  • Improve navigation to key Rubenstein site pages from within research center / collection pages
  • Display larger images illustrative of the library’s distinctive and diverse collections
  • Retain aspects of the homepage that have been effective, e.g., hours and resource search boxes
  • Improve the site aesthetic

Internal Navigation

With a new primary navigation in hand on the Rubenstein homepage that links to key pages in the site, we began to explore ways to get visitors to those links in an unobtrusive way when they aren’t on the homepage.  Each research center within the library, e.g., the John W. Hartman Center for Sales, Advertising & Marketing History, has its own sub-site with its own secondary menus, which already contend a bit with the blue Duke Libraries menu in the masthead.  To avoid burying visitors in a Russian nesting doll of navigation, we decided to try dropping the RL menu down from the breadcrumb trail link so it’s tucked away, but still accessible when needed. We’re eager to learn whether or not this is effective.

Rubenstein primary navigation menu available from breadcrumb trail.

A Look Back

Depending on how you count, this is now the seventh or eighth homepage design for the Rubenstein Library (formerly the Rare Book, Manuscript, and Special Collections Library; formerly the Special Collections Library). I thought I’d take a quick stroll down memory lane, courtesy of the Internet Archive’s Wayback Machine, to reflect on how far we have come over the years.

1996

Duke Special Collections Library website, 1996
October 1996 (explore)

Features:

  • prominent news, exhibits, and online collections
  • links to online SGML- and HTML-encoded finding aids (42 of them!)
  • a site search box powered by Excite!

1997

April 1997 (explore)

Features:

  • two-column layout with a left-hand nav
  • digitized collections
  • a special collections newsletter called The Broadside
  • became the “Rare Book, Manuscript, and Special Collections Library” in 1997

2005

December 2005 (explore Jan 2006)

Features:

  • color-coded navigation broken into three groups of links
  • image from the collections
  • featured exhibit with image
  • rounded corners and shadows
  • first use of a CMS (content management system named Cascade Server)*

2007

August 2007 (explore)

Features:

  • first time sharing a masthead with rest of the Duke University Libraries
  • retained the lists of links, single collection image, and featured exhibit from previous iteration

2011

September 2011 (explore)

Features:

  • renamed as the David M. Rubenstein Rare Book & Manuscript Library
  • first time with catalog and finding aids search boxes on the homepage
  • first appearance of social media & RSS icons
  • first iteration to display library hours
  • first news carousel appearance

2014

January 2014 (explore)

Features:

  • new site in Drupal content management system
  • first responsive RL website (works well on mobile devices)
  • array of vertical image panels from the collections
  • extended color palette to match Duke University website styles (at the time)
  • gradients and rounded buttons with shadows
  • first time able to search digital collections from RL homepage
  • first site with Login button for Aeon (Special Collections request system)

2017

January 2017 (explore)

Features:

2018

Rubenstein website, Jan 2018
January 2018

Features

  • lightened the overall aesthetic
  • featured image cycling from selections at random (diagonally sliced using css clip-path polygons)
  • prominent current exhibits feed with images
  • a primary nav with dropdown menus

How long will this latest edition of the Rubenstein Library homepage stick around? Only time will tell, but we’ll surely continue to iterate, learn from the past, and improve with each attempt. For now, we’re pleased with the new site, and hope you will be as well.


* Revised Feb 9, 2018 to reflect that the first version using a content management system was in 2005 rather than 2007.

Accessible AV in the Duke Digital Repository

Over the course of 2017, we improved our capacity to support digital audiovisual materials in the Duke Digital Repository (DDR) by leaps and bounds. A little more than a year ago, I had written a Bitstreams blog post highlighting the new features we had just developed in the DDR to provide basic functionality for AV, especially in support of the Duke Chapel Recordings collection. What a difference a year makes.

This past year brought renewed focus on AV development, as we worked to bring the NEH grant-funded Radio Haiti Archive online (launched in June).  At the same time, our digital collections legacy platform migration efforts shifted toward moving our existing high-profile digital AV material into the repository.

Closed Captions

At Duke University Libraries, we take accessibility seriously. We aim to include captions or transcripts for the audiovisual objects made available via the Duke Digital Repository, especially to ensure that the materials can be perceived and navigated by people with disabilities. For instance, work is well underway to create closed captions for all 1,400 items in the Duke Chapel Recordings project.

Screenshot showing Charmin commercial from AdViews collection with caption overlay
Captioned video displays a CC button and shows captions as an overlay in the video player. Example from the AdViews collection, coming soon to the DDR.

The DDR now accommodates modeling and ingest for caption files, and our AV player interface (powered by JW Player) presents a CC button whenever a caption file is available. Caption files are encoded using WebVTT, the modern W3C standard for associating timed text with HTML audio and video. WebVTT is structured so as to be machine-processable, while remaining lightweight enough to be reasonably read, created, or edited by a person. It’s a format that transcription vendors can provide. And given its endorsement by W3C, it should be a viable captioning format for a wide range of applications and devices for the foreseeable future.

Example WebVTT captions
Text cues from a WebVTT caption file for an audio item in the Duke Chapel Recordings collection.

Interactive Transcripts

Displaying captions within the player UI is helpful, but it only gets us so far. For one, that doesn’t give a user a way to just read the caption text without requiring them to play the media. We also need to support captions for audio files, but unlike with video, the audio player doesn’t include enough real estate within itself to render the captions. There’s no room for them to appear.

So for both audio and video, our solution is to convert the WebVTT caption files on-the-fly into an interactive in-page transcript. Using the webvtt-ruby gem (developed by Coconut) , we parse the WebVTT text cues into Ruby objects, then render them back on the page as HTML. We then use the JWPlayer Javascript API to keep the media player and the HTML transcript in sync. Clicking on a transcript cue advances the player to the corresponding moment in the media, and the currently-playing cue gets highlighted as the media plays.

Screenshot of interactive audio transcript
Example interactive synchronized transcript for an audio item (rendered from a WebVTT caption file). From a collection coming soon to the DDR.

 

We also do some extra formatting when the WebVTT cues include voice tags (<v> tags), which can optionally indicate the name of the speaker (e.g., <v Jane Smith>). The in-page transcript is indexed by Google for search retrieval.

Transcript Documents

In many cases, especially for audio items, we may have only a PDF or other type of document with a transcript of a recording that isn’t structured or time-coded.  Like captions, these documents are important for accessibility. We have developed support for displaying links to these documents near the media player. Look for some new collections using this feature to become available in early 2018.

Screenshot of a transcript document menu above the AV player
Transcript documents presented above the media player. Coming soon to AV collections in the DDR.

 

 

A/V Embedding

The DDR web interface provides an optimal viewing or listening experience for AV, but we also want to make it easy to present objects from the DDR on other websites, too. When used on other sites, we’d like the objects to include some metadata, a link to the DDR page, and proper attribution. To that end, we now have copyable <iframe> embed code available from the Share menu for AV items.

Embed code in the Share menu for an audio item.
Copyable embed code from an audio recording in the Radio Haiti Archive.

This embed code is also what we now use within the Rubenstein Library collection guides (finding aids) interface: it lets us present digital objects from the DDR directly from within a corresponding collection guide. So as a researcher browses the inventory of a physical archival collection, they can play the media inline without having to leave.

Screenshot of Rubenstein Library collection guide presenting a Duke Chapel Recordings video inline.
Embedded view of a DDR video from the Duke Chapel Recordings collection presented inline  in a Rubenstein Library archival collection guide.

Sites@Duke Integration
If your website or blog is one of the thousands of WordPress sites hosted and supported by Sites@Duke — a service of Duke’s Office of Information Technology (OIT) — we have good news for you. You can now embed objects from the DDR using WordPress shortcode. Sites@Duke, like many content management systems, doesn’t allow authors to enter <iframe> tags, so shortcode is the only way to get embeddable media to render.

Example of WordPress shortcode for DDR embedding on Sites@Duke.edu sites.
Sites@Duke WordPress sites can embed DDR media by using shortcode with the DDR item’s permalink.

 

And More!

Here are the other AV-related features we have been able to develop in 2017:

  • Access control: master files &  derivatives alike can be protected so access is limited to only authorized users/groups
  • Video thumbnail images: model, manage, and display
  • Video poster frames: model, manage, and display
  • Intermediate/mezzanine files: model and manage
  • Rights display: display icons and info from RightsStatements.org and Creative Commons, so it’s clear what users are permitted to do with media.

What’s Next

We look forward to sharing our recent AV development with our peers at the upcoming Samvera Connect conference (Nov 6-9, 2017 in Evanston, IL). Here’s our poster summarizing the work to date:

Poster presentation screenshot for Samvera Connect 2017
Poster about Duke’s AV development for Samvera Connect conference, Nov 6-9, 2017 (Evanston, IL)

Looking ahead to the next couple months, we aim to round out the year by completing a few more AV-related features, most notably:

  • Export WebVTT captions as PDF or .txt
  • Advance the player via linked timecodes in the description field in an item’s metadata
  • Improve workflows for uploading caption files and transcript documents

Now that these features are in place, we’ll be sharing a bunch of great new AV collections soon!

Turning on the Rights in the Duke Digital Repository

As 2017 reaches its halfway point, we have concluded another busy quarter of development on the Duke Digital Repository (DDR). We have several new features to share, and one we’re particularly delighted to introduce is Rights display.

Back in March, my colleague Maggie Dickson shared our plans for rights management in the DDR, a strategy built upon using rights status URIs from RightsStatements.org, and in a similar fashion, licenses from Creative Commons. In some cases, we supplement the status with free text in a local Rights Note property. Our implementation goals here were two-fold: 1) use standard statuses that are machine-readable; 2) display them in an easily understood manner to users.

New rights display feature in action on a digital object.

What to Display

Getting and assigning machine-readable URIs for Rights is a significant milestone in its own right. Using that value to power a display that makes sense to users is the next logical step. So, how do we make it clear to a user what they can or can’t do with a resource they have discovered? While we could simply display the URI and link to its webpage (e.g., http://rightsstatements.org/vocab/InC-EDU/1.0/ ) the key info still remains a click away. Alternatively, we could display the rights statement or license title with the link, but some of them aren’t exactly intuitive or easy on the eyes. “Attribution-NonCommercial-NoDerivatives 4.0 International,” anyone?

Inspiration

Looking around to see how other cultural heritage institutions have solved this problem led us to very few examples. RightsStatements.org is still fairly new and it takes time for good design patterns to emerge. However, Europeana — co-champion of the RightsStatements.org initiative along with DPLA — has a stellar collections site, and, as it turns out, a wonderfully effective design for displaying rights statuses to users. Our solution ended up very much inspired by theirs; hats off to the Europeana team.

Image from Europeana site.
Europeana Collections UI.

Icons

Both Creative Commons and RightsStatements.org provide downloadable icons at their sites (here and here). We opted to store a local copy of the circular SVG versions for both to render in our UI. They’re easily styled, they don’t take up a lot of space, and used together, they have some nice visual unity.

Rights & Licenses Icons
Circular icons from Creative Commons & RightsStatements.org

Labels & Titles

We have a lightweight Rails app with an easy-to-use administrative UI for managing auxiliary content for the DDR, so that made a good home for our rights statuses and associated text. Statements are modeled to have a URI and Title, but can also have three additional optional fields: short title, re-use text, and an array of icon classes.

Editing rights info associated with each statement.

Displaying the Info

We wanted to be sure to show the rights status in the flow of the rest of an object’s metadata. We also wanted to emphasize this information for anyone looking to download a digital object. So we decided to render the rights status prominently in the download menu, too.

Rights status in download menu
Rights status displays in the download menu.

 

Rights status also displays alongside other metadata.

What’s Next

Our focus in this area now shifts toward applying these newly available rights statuses to our existing digital objects in the repository, while ensuring that new ingests/deposits get assessed and assigned appropriate values. We’ll also have opportunities to refine where and how the statuses get displayed. We stand to learn a lot from our peer organizations implementing their own rights management strategies, and from our visitors as they use this new feature on our site. There’s a lot of work ahead, but we’re thrilled to have reached this noteworthy milestone.

A Refreshing New Look for Our Library Website

If you’ve visited the Duke University Libraries website in the past month, you may have noticed that it looks a bit more polished than it used to. Over the course of the fall 2016 semester, my talented colleague Michael Daul and I co-led a project to develop and implement a new theme for the site. We flipped the switch to launch the theme on January 6, 2017, the week before spring classes began. In this post, I’ll share some background on the project and its process, and highlight some noteworthy features of the new theme we put in place.

Newly refreshed Duke University Libraries website homepage.

Goals

We kicked off the project in Aug 2016 using the title “Website Refresh” (hat-tip to our friends at NC State Libraries for coining that term). The best way to frame it was not as a “redesign,” but more like a 50,000-mile maintenance tuneup for the site.  We had four main goals:

  • Extend the Life of our current site (in Drupal 7) without a major redesign or redevelopment effort
  • Refresh the Look of the site to be modern but not drastically different
  • Better Code by streamlining HTML markup & CSS style code for easier management & flexibility
  • Enhance Accessibility via improved compliance with WCAG accessibility guidelines

Our site is fairly large and complex (1,200+ pages, for starters). So to keep the scope lean, we included no changes in content, information architecture, or platform (i.e., stayed on Drupal 7). We also worked with a lean stakeholder team to make decisions related to aesthetics.

Extending the Life of the Site

Our old website theme was aging; the project leading to its development began five years ago in Sep 2012, was announced in Jan 2013, and then eventually launched about three years ago in Jan 2014. Five years–and even three–is a long time in web years. Sites accumulate a lot of code cruft over time, the tools for managing and writing code become deprecated quickly. We wanted to invest a little time now to replace some pieces of the site’s front-end architecture with newer and better replacements, in order to buy us more time before we’d have to do an expensive full-scale overhaul from the ground up.

Refreshing the Look

Our 2014 site derived a lot its aesthetic from the main Duke.edu website at the time. Duke’s site has changed significantly since then, and meanwhile, web design trends have changed dramatically: flat design is in, skeuomorphism out.  Google Web Fonts are in, Times, Arial, Verdana and company are out.  Even a three year old site on the web can look quite dated.

Old site theme, dated aesthetics.
New “refreshed” theme, with flatter, more modern aesthetic

Closeup on skeuomorphic embellishments vs. flat elements.

Better Code

Beyond evolving aesthetics, the various behind-the-scenes web frameworks and code workflows are in constant, rapid flux; it can really keep a developer’s head on a swivel. Better code means easier maintenance, and to that end our code got a lot better after implementing these solutions:

  • Bootstrap Upgrade. For our site’s HTML/CSS/JS framework, we moved from Bootstrap version 2 (2.3.1) to version 3 (3.3.7). This took weeks of work: it meant thousands of pages of markup revisions, only some of which could be done with a global Search & Replace.
  •  Sass for CSS. We trashed all of our old theme’s CSS files and started over using Sass, a far more efficient way to express and maintain style rules than vanilla CSS.
  • Gulp for Automation. Our new theme uses Gulp to automate code tasks like processing Sass into CSS, auto-prefixing style declarations to work on older browsers, and crunching 30+ css files down into one.
  • Font Awesome. We ditched most of our older image-based icons in favor of Font Awesome ones, which are far easier to reference and style, and faster to load.
  • Radix.  This was an incredibly useful base theme for Drupal that encapsulates/integrates Sass, Gulp, Bootstrap, and FontAwesome. It also helped us get a Bootswatch starter theme in the mix to minimize the local styling we had to do on top of Bootstrap.

We named our new theme Dulcet and put it up on GitHub.

Sass for style management, e.g., expressing colors as reusable variables.
Gulp for task automation, e.g., auto-prefixing styles to account for older browser workarounds.

 Accessibility

Some of the code and typography revisions we’ve made in the “refresh” improve our site’s compliance with WCAG2.0 accessibility guidelines. We’re actively working on further assessment and development in this area. Our new theme is better suited to integrate with existing tools, e.g., to automatically add ARIA attributes to interactive page elements.

Feedback or Questions?

We would love to hear from you if you have any feedback on our new site, if you spot any oddities, or if you’re considering doing a similar project and have any questions. We encourage you to explore the site, and hope you find it a refreshing experience.

Go Faster, Do More, Integrate Everything, and Make it Good

We’re excited to have released nine digitized collections online this week in the Duke Digital Repository (see the list below ). Some are brand new, and the others have been migrated from older platforms. This brings our tally up to 27 digitized collections in the DDR, and 11,705 items. That’s still just a few drops in what’ll eventually be a triumphantly sloshing bucket, but the development and outreach we completed for this batch is noteworthy. It changes the game for our ability to put digital materials online faster going forward.

Let’s have a look at the new features, and review briefly how and why we ended up here.

Collection Portals: No Developers Needed

Mangum Photos Collection
The Hugh Mangum Photographs collection portal, configured to feature selected images.

Before this week, each digital collection in the DDR required a developer to create some configuration files in order to get a nice-looking, made-to-order portal to the collection. These configs set featured items and their layout, a collection thumbnail, custom rules for metadata fields and facets, blog feeds, and more.

 

Duke Chapel Recordings Portal
The Duke Chapel Recordings collection portal, configured with customized facets, a blog feed, and images external to the DDR.

It’s helpful to have this kind of flexibility. It can enhance the usability of collections that have distinctive characteristics and unique needs. It gives us a way to show off photos and other digitized images that’d otherwise look underwhelming. But on the other hand, it takes time and coordination that isn’t always warranted for a collection.

We now have an optimized default portal display for any digital collection we add, so we don’t need custom configuration files for everything. A collection portal is not as fancy unconfigured, but it’s similar and the essential pieces are present. The upshot is: the digital collections team can now take more items through the full workflow quickly–from start to finish–putting collections online without us developers getting in the way.

Whitener Collection Portal
A new “unconfigured” collection portal requiring no additional work by developers to launch. Emphasis on archival source collection info in lieu of a digital collection description.

Folder Items

To better accommodate our manuscript collections, we added more distinction in the interface between different kinds of image items. A digitized archival folder of loose manuscript material now includes some visual cues to reinforce that it’s a folder and not, e.g., a bound album, a single photograph, or a two-page letter.

Folder items
Folder items have a small folder icon superimposed on their thumbnail image.
Folder item view
Above the image viewer is a folder icon with an image count; the item info header below changes to “Folder Info”

We completed a fair amount of folder-level digitization in recent years, especially between 2011-2014 as part of a collaborative TRLN Large-Scale Digitization IMLS grant project.  That initiative allowed us to experiment with shifting gears to get more digitized content online efficiently. We succeeded in that goal, however, those objects unfortunately never became accessible or discoverable outside of their lengthy, text-heavy archival collection guides (finding aids). They also lacked useful features such as zooming, downloading, linking, and syndication to other sites like DPLA. They were digital collections, but you couldn’t find or view them when searching and browsing digital collections.

Many of this week’s newly launched collections are composed of these digitized folders that were previously siloed off in finding aids. Now they’re finally fully integrated for preservation, discovery, and access alongside our other digital collections in the DDR. They remain viewable from within the finding aids and we link between the interfaces to provide proper context.

Keyboard Nav & Rotation

Two things are bound to increase when digitizing manuscripts en masse at the folder level: 1) the number of images present in any given “item” (folder); 2) the chance that something of interest within those pages ends up oriented sideways or upside-down. We’ve improved the UI a bit for these cases by adding full keyboard navigation and rotation options.

Rotate Image Feature
Rotation options in the image viewer. Navigate pages by keyboard (Page Up/Page Down on Windows, Fn+Up/Down on Mac).

Conclusion

Duke Libraries’ digitization objectives are ambitious. Especially given both the quality and quantity of distinctive, world-class collections in the David M. Rubenstein Library, there’s a constant push to: 1) Go Faster, 2) Do More, 3) Integrate Everything, and 4) Make Everything Good. These needs are often impossibly paradoxical. But we won’t stop trying our best. Our team’s accomplishments this week feel like a positive step in the right direction.

Newly Available DDR Collections in Sept 2016

Web Interfaces for our Audiovisual Collections

Audiovisual materials account for a significant portion of Duke’s Digital Collections. All told, we now have over 3,400 hours of A/V content accessible online, spread over 14,000 audio and video files discoverable in various platforms. We’ve made several strides in recent years introducing impactful collections of recordings like H. Lee Waters Films, the Jazz Loft Project Records, and Behind the Veil: Documenting African American Life in the Jim Crow South. This spring, the Duke Chapel Recordings collection (including over 1,400 recordings) became our first A/V collection developed in the emerging Duke Digital Repository platform. Completing this first phase of the collection required some initial development for A/V interfaces, and it’ll keep us on our toes to do more as the project progresses through 2019.

A video recording in the Duke Chapel Recordings collection.
A video interface in the Duke Chapel Recordings collection.

Preparing A/V for Access Online

When digitizing audio or video, our diligent Digital Production Center staff create a master file for digital preservation, and from that, a single derivative copy that’s smaller and appropriately compressed for public consumption on the web. The derivative files we create are compressed enough that they can be reliably pseudo-streamed (a.k.a. “progressive download”) to a user over HTTP in chunks (“byte ranges”) as they watch or listen. We are not currently using a streaming media server.

Here’s what’s typical for these files:

  • Audio. MP3 format, 128kbps bitrate. ~1MB/minute.
  • Video. MPEG4 (.mp4) wrapper files. ~17MB/minute or 1GB/hour.
    The video track is encoded as H.264 at about 2,300 kbps; 640×480 for standard 4:3.
    The audio track is AAC-encoded at 160kbps.

These specs are also consistent with what we request of external vendors in cases where we outsource digitization.

The A/V Player Interface: JWPlayer

Since 2014, we have used a local instance of JWPlayer as our A/V player of choice for digital collections. JWPlayer bills itself as “The Most Popular Video Player & Platform on the Web.” It plays media directly in the browser by using standard HTML5 video specifications (supported for most intents & purposes now by all modern browsers).

We like JWPlayer because it’s well-documented, and easy to customize with a robust Javascript API to hook into it. Its developers do a nice job tracking browser support for all HTML5 video features, and they design their software with smart fallbacks to look and function consistently no matter what combo of browser & OS a user might have.

In the Duke Digital Repository and our archival finding aids, we’re now using the latest version of JWPlayer. It’s got a modern, flat aesthetic and is styled to match our color palette.

JW Player displaying inline video for the Jazz Loft Project Records collection guide.

Playlists

Here’s an area where we extended the new JWPlayer with some local development to enhance the UI. When we have a playlist—that is, a recording that is made up of more than one MP3 or MP4 file—we wanted a clearer way for users to navigate between the files than what comes out of the box. It was fairly easy to create some navigational links under the player that indicate how many files are in the playlist and which is currently playing.

A multi-part audio item from Duke Chapel Recordings.
A multi-part audio item from Duke Chapel Recordings.

Captions & Transcripts

Work is now underway (by three students in the Duke Divinity School) to create timed transcripts of all the sermons given within the recorded services included in the Duke Chapel Recordings project.

We contracted through Popup Archive for computer-generated transcripts as a starting point. Those are about 80% accurate, but Popup provides a really nice interface for editing and refining the automated text before exporting it to its ultimate destination.

Caption editing interface provided by Popup Archive
Caption editing interface provided by Popup Archive

One of the most interesting aspects of HTML5 <video> is the <track> element, wherein you can associate as many files of captions, subtitles, descriptions, or chapter information as needed.  Track files are encoded as WebVTT; so we’ll use WebVTT files for the transcripts once complete. We’ll also likely capture the start of a sermon within a recording as a WebVTT chapter marker to provide easier navigation to the part of the recording that’s the most likely point of interest.

JWPlayer displays WebVTT captions (and chapter markers, too!). The captions will be wonderful for accessibility (especially for people with hearing disabilities); they can be toggled on/off within the media player window. We’ll also be able to use the captions to display an interactive searchable transcript on the page near the player (see this example using Javascript to parse the WebVTT). Our friends at NCSU Libraries have also shared some great work parsing WebVTT (using Ruby) for interactive transcripts.

The Future

We have a few years until the completion of the Duke Chapel Recordings project. Along the way, we expect to:

  • add closed captions to the A/V
  • create an interactive transcript viewer from the captions
  • work those captions back into the index to aid discovery
  • add a still-image extract from each video to use as a thumbnail and “poster frame” image
  • offer up much more A/V content in the Duke Digital Repository

Stay tuned!

Perplexed by Context? Slick Sticky Titles Skip the Toll of the Scroll

We have a few new exciting enhancements within our digital collections and archival collection guide interfaces to share this week, all related to the challenge of presenting the proper archival context for materials represented online. This is an enormous problem we’ve previously outlined on this blog, both in terms of reconciling different descriptions of the same things (in multiple metadata formats/systems)  and in terms of providing researchers with a clear indication of how a digitized item’s physical counterpart is arranged and described within its source archival collection.

Here are the new features:

View Item in Context Link

Our new digital collections (the ones in the Duke Digital Repository) have included a prominent link (under header “Source Collection”) from a digitized item to its source archival collection with some snippets of info from the collection guide presented in a popover. This was an important step toward connecting the dots, but still only gets someone to the top of the collection guide; from there, researchers are left on their own for figuring out where in the collection an item resides.

Archival source collection info presented for an item in the W. Duke & Sons collection.
Archival source collection info presented for an item in the W. Duke & Sons collection.

Beginning with this week’s newly-available Alex Harris Photographs Collection (and also the Benjamin & Julia Stockton Rush Papers Collection), we take it another step forward and present a deep link directly to the row in the collection guide that describes the item. For now, this link says “View Item in Context.”

A deep link to View Item in Context for an item in the Alex Harris Photographs Collection
A deep link to View Item in Context for an item in the Alex Harris Photographs Collection

This linkage is powered by indicating an ArchivesSpace ID in a digital object’s administrative metadata; it can be the ID for a series, subseries, folder, or item title, so we’re flexible in how granular the connection is between the digital object and its archival description.

Sticky Title & Series Info

Our archival collection guides are currently rendered as single webpages broken into sections. Larger collections make for long webpages. Sometimes they’re really super long. Where the contents of the collection are listed, there’s a visual hierarchy in place with nested descriptions of series, subseries, etc. but it’s still difficult to navigate around and simultaneously understand what it is you’re viewing. The physical tedium of scrolling and the cognitive load required to connect related descriptive information located far away on a page make for bad usability.

As of last week, we now we keep the title of the collection “stuck” to the top of the screen once you’re no longer viewing the top of the page (it also functions as a link to get back to the top). And even more helpful is a new sticky series header that links to the beginning of the archival series within which the currently visible items were arranged; there’s usually an important description up there that helps contextualize the items listed below. This sticky header is context-aware, meaning it follows you around like a loyal companion, updating itself perpetually to reflect where you are as you navigate up or down.

Title & series information "stuck" to the top of a collection guide.
Title & series information “stuck” to the top of a collection guide.

This feature is powered via the excellent Bootstrap Scrollspy Javascript utility combined with some custom styling.

All Series Browser

To give researchers easier browsing between different archival series in a collection, we added a link in the sticky header to browse “All Series.” That link pops down a menu to jump directly to the start of each series within the collection guide.

All Series

Direct Links to Anything

Researchers can now easily get a link to any row in a collection guide where the contents are described. This can be anything: a series, subseries, folder, or item. It’s simple—just mouseover the row, click the arrow that appears at the left, and copy the URL from the address bar. The row in the collection guide that’s the target of that link gets highlighted in green.

Click the arrow to link directly to a row within the collection guide.
Click the arrow to link directly to a row within the collection guide.

We would love to get feedback on these features to learn whether they’re helpful and see how we might enhance or adjust them going forward. Try them out and let us know what you think!

Special thanks to our metadata gurus Noah Huffman and Maggie Dickson for their contributions on these features.

Zoomable Hi-Res Images: Hopping Aboard the OpenSeadragon Bandwagon

Our new W. Duke & Sons digital collection (released a month ago) stands as an important milestone for us: our first collection constructed in the (Hydra-based) Duke Digital Repository, which is built on a suite of community-built open source software. Among that software is a remarkable image viewer tool called OpenSeadragon. Its website describes it as:

“an open-source, web-based viewer for high-resolution zoomable images, implemented in pure Javascript, for desktop and mobile.”

OpenSeadragon viewer in action on W. Duke & Sons collection.
OpenSeadragon viewer in action on W. Duke & Sons collection.
OpenSeadragon zoomed in, W. Duke & Sons collection.
OpenSeadragon zoomed in, W. Duke & Sons collection.

In concert with tiled digital images (we use Pyramid TIFFs), an image server (IIPImage), and a standard image data model (IIIF: International Image Interoperability Framework), OpenSeadragon considerably elevates the experience of viewing our image collections online. Its greatest virtues include:

  • smooth, continuous zooming and panning for high-resolution images
  • open source, built on web standards
  • extensible and well-documented

We can’t wait to get to share more of our image collections in the new platform.

OpenSeadragon Examples Elsewhere

Arthur C. Clarke’s Third Law states, “Any sufficiently advanced technology is indistinguishable from magic.” And looking at high-res images in OpenSeadragon feels pretty darn magical. Here are some of my favorite implementations from places that inspired us to use it:

  1. The Metropolitan Museum of Art. Zooming in close on this van Gogh self-portrait gives you a means to inspect the intense brushstrokes and texture of the canvas in a way that you couldn’t otherwise experience, even by visiting the museum in-person.

    Self-Portrait with a Straw Hat (obverse: The Potato Peeler). Vincent van Gogh, 1887.
    Self-Portrait with a Straw Hat (obverse: The Potato Peeler). Vincent van Gogh, 1887.
  2. Chronicling America: Historic American Newspapers (Library of Congress). For instance, zoom to read in the July 21, 1871 issue of “The Sun” (New York City) about my great-great-grandfather George Aery’s conquest being crowned the Schuetzen King, sharpshooting champion, at a popular annual festival of marksmen.
    The sun. (New York [N.Y.]), 21 July 1871. Chronicling America: Historic American Newspapers. Lib. of Congress.
    The sun. (New York [N.Y.]), 21 July 1871. Chronicling America: Historic American Newspapers. Lib. of Congress.
  3. Other GLAMs. See these other nice examples from The National Gallery of Art, The Smithsonian National Museum of American Museum, NYPL Digital Collections, and Digital Public Library of America (DPLA).

OpenSeadragon’s Microsoft Origins

OpenSeadragon

The software began with a company called Sand Codex, founded in Princeton, NJ in 2003. By 2005, the company had moved to Seattle and changed its name to Seadragon Software. Microsoft acquired the company in 2006 and positioned Seadragon within Microsoft Live Labs.

In March 2007, Seadragon founder Blaise Agüera y Arcase gave a TED Talk where he showcased the power of continuous multi-resolution deep-zooming for applications built on Seadragon. In the months that followed, we held a well-attended staff event at Duke Libraries to watch the talk. There was a lot of ooh-ing and aah-ing. Indeed, it looked like magic. But while it did foretell a real future for our image collections, at the time it felt unattainable and impractical for our needs. It was a Microsoft thing. It required special software to view. It wasn’t going to happen here, not when we were making a commitment to move away from proprietary platforms and plugins.

Sometime in 2008, Microsoft developed a more open Javascript-based version of Seadragon called Seadragon Ajax, and by 2009 had shared it as open-source software via a New BSD license.  That curtailed many barriers for use, however it still required a Microsoft server-side framework and Microsoft AJAX library.  So in the years since, the software has been re-engineered to be truly open, framework-agnostic, and has thus been rebranded as OpenSeadragon. Having a technology that’s this advanced–and so useful–be so open has been an incredible boon to cultural heritage institutions and, by extension, to the patrons we serve.

Setup

OpenSeadragon’s documentation is thorough, so that helped us get up and running quickly with adding and customizing features. W. Duke & Sons cards were scanned front & back, and the albums are paginated, so we knew we had to support navigation within multi-image items. These are the key features involved:

Customizations

Some aspects of the interface weren’t quite as we needed them to be out-of-the-box, so we added and customized a few features.

  • Custom Button Binding. Created our own navigation menu to match our site’s more modern aesthetic.
  • Page Indicator / Jump to Page. Developed a page indicator and direct-input page jump box using the OpenSeadragon API
  • Styling. Revised the look & feel with additional CSS & Javascript.

Future Directions: Page-Turning & IIIF

OpenSeadragon does have some limitations where we think that it alone won’t meet all our needs for image interfaces. When we have highly-structured paginated items with associated transcriptions or annotations, we’ll need to implement something a bit more complex. Mirador (example) and Universal Viewer (example) are two example open-source page-viewer tools that are built on top of OpenSeadragon. Both projects depend on “manifests” using the IIIF presentation API to model this additional data.

The Hydra Page Turner Interest Group recently produced a summary report that compares these page-viewer tools and features, and highlights strategies for creating the multi-image IIIF manifests they rely upon. Several Hydra partners are already off and running; at Duke we still have some additional research and development to do in this area.

We’ll be adding many more image collections in the coming months, including migrating all of our existing ones that predated our new platform. Exciting times lie ahead. Stay tuned.

Animated Demo

eye-ui-demo-4