Category Archives: User Experience

Conferences, Digital Collections, Duke Digital Repository, Technology, User Experience

Accessible AV in the Duke Digital Repository

October 24, 2017 Sean Aery

Over the course of 2017, we improved our capacity to support digital audiovisual materials in the Duke Digital Repository (DDR) by leaps and bounds. A little more than a year ago, I wrote a Bitstreams blog post highlighting the new features we had just developed in the DDR to provide basic functionality for AV, especially in support of the Duke Chapel Recordings collection. What a difference a year makes.

This past year brought renewed focus on AV development, as we worked to bring the NEH grant-funded Radio Haiti Archive online (launched in June). At the same time, our digital collections legacy platform migration efforts shifted toward moving our existing high-profile digital AV material into the repository.

Closed Captions

At Duke University Libraries, we take accessibility seriously. We aim to include captions or transcripts for the audiovisual objects made available via the Duke Digital Repository, especially to ensure that the materials can be perceived and navigated by people with disabilities. For instance, work is well underway to create closed captions for all 1,400 items in the Duke Chapel Recordings project.

Screenshot showing Charmin commercial from AdViews collection with caption overlay — Captioned video displays a CC button and shows captions as an overlay in the video player. Example from the AdViews collection, coming soon to the DDR.

The DDR now accommodates modeling and ingest for caption files, and our AV player interface (powered by JW Player) presents a CC button whenever a caption file is available. Caption files are encoded using WebVTT, the modern W3C standard for associating timed text with HTML audio and video. WebVTT is structured so as to be machine-processable, while remaining lightweight enough to be reasonably read, created, or edited by a person. It’s a format that transcription vendors can provide. And given its endorsement by W3C, it should be a viable captioning format for a wide range of applications and devices for the foreseeable future.

Example WebVTT captions — Text cues from a WebVTT caption file for an audio item in the Duke Chapel Recordings collection.

Interactive Transcripts

Displaying captions within the player UI is helpful, but it only gets us so far. For one, that doesn’t give a user a way to just read the caption text without requiring them to play the media. We also need to support captions for audio files, but unlike with video, the audio player doesn’t include enough real estate within itself to render the captions. There’s no room for them to appear.

So for both audio and video, our solution is to convert the WebVTT caption files on-the-fly into an interactive in-page transcript. Using the webvtt-ruby gem (developed by Coconut), we parse the WebVTT text cues into Ruby objects, then render them back on the page as HTML. We then use the JW Player JavaScript API to keep the media player and the HTML transcript in sync. Clicking on a transcript cue advances the player to the corresponding moment in the media, and the currently playing cue gets highlighted as the media plays.
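To make the parse-then-render idea concrete, here is a minimal sketch in Python. It is not the DDR's Ruby code and does not use the webvtt-ruby gem; the names `Cue`, `parse_webvtt`, and `render_transcript`, and the `data-start` attribute, are assumptions made for illustration. It handles only simple cues (no cue settings, styles, or regions).

```python
import html
import re
from dataclasses import dataclass

# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.500".
# The hours component is optional in WebVTT timestamps.
TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the beginning of the media
    end: float    # seconds from the beginning of the media
    text: str


def _seconds(hours, minutes, seconds, millis):
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_webvtt(source):
    """Parse the text cues of a simple WebVTT file into Cue objects."""
    cues = []
    # Blocks (header, notes, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", source.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                parts = match.groups()
                text = " ".join(l.strip() for l in lines[i + 1:])
                cues.append(Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break  # blocks without a timing line (e.g. the header) are skipped
    return cues


def render_transcript(cues):
    """Render cues as an HTML list; data-start carries each cue's start time."""
    items = [
        '<li class="cue" data-start="{:.3f}">{}</li>'.format(c.start, html.escape(c.text))
        for c in cues
    ]
    return '<ol class="transcript">\n' + "\n".join(items) + "\n</ol>"
```

With start times stored on each element, a small player-side script can seek the player when a cue is clicked and highlight whichever cue spans the current playback position.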

Screenshot of interactive audio transcript — Example interactive synchronized transcript for an audio item (rendered from a WebVTT caption file). From a collection coming soon to the DDR.

We also do some extra formatting when the WebVTT cues include voice tags (<v> tags), which can optionally indicate the name of the speaker (e.g., <v Jane Smith>). The in-page transcript is indexed by Google for search retrieval.
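The voice-tag handling could look something like the following Python sketch. The function name `format_cue` and the `speaker` CSS class are hypothetical, and the DDR's actual formatting may differ; the sketch only shows how a `<v Name>` span can be split into a speaker label and the spoken text.

```python
import html
import re

# A WebVTT voice span looks like "<v Jane Smith>Hello there</v>". The closing
# tag may be omitted when the voice span covers the whole cue.
VOICE = re.compile(r"<v(?:\.[^\s>]*)?\s+([^>]+)>(.*?)(?:</v>|$)", re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def format_cue(text):
    """Return HTML for one cue, labeling the speaker when a <v> tag is present."""
    match = VOICE.search(text)
    if not match:
        # No voice tag: drop any other inline tags and escape the rest.
        return html.escape(TAG.sub("", text))
    speaker = match.group(1).strip()
    spoken = TAG.sub("", match.group(2)).strip()
    return '<span class="speaker">{}:</span> {}'.format(
        html.escape(speaker), html.escape(spoken)
    )
```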

Transcript Documents

In many cases, especially for audio items, we may have only a PDF or other type of document with a transcript of a recording that isn’t structured or time-coded. Like captions, these documents are important for accessibility. We have developed support for displaying links to these documents near the media player. Look for some new collections using this feature to become available in early 2018.

Screenshot of a transcript document menu above the AV player — Transcript documents presented above the media player. Coming soon to AV collections in the DDR.

A/V Embedding

The DDR web interface provides an optimal viewing or listening experience for AV, but we also want to make it easy to present objects from the DDR on other websites, too. When used on other sites, we’d like the objects to include some metadata, a link to the DDR page, and proper attribution. To that end, we now have copyable <iframe> embed code available from the Share menu for AV items.

Embed code in the Share menu for an audio item. — Copyable embed code from an audio recording in the Radio Haiti Archive.

This embed code is also what we now use within the Rubenstein Library collection guides (finding aids) interface: it lets us present digital objects from the DDR directly from within a corresponding collection guide. So as a researcher browses the inventory of a physical archival collection, they can play the media inline without having to leave.

Screenshot of Rubenstein Library collection guide presenting a Duke Chapel Recordings video inline. — Embedded view of a DDR video from the Duke Chapel Recordings collection presented inline in a Rubenstein Library archival collection guide.

Sites@Duke Integration

If your website or blog is one of the thousands of WordPress sites hosted and supported by Sites@Duke, a service of Duke’s Office of Information Technology (OIT), we have good news for you. You can now embed objects from the DDR using a WordPress shortcode. Sites@Duke, like many content management systems, doesn’t allow authors to enter <iframe> tags, so a shortcode is the only way to get embeddable media to render.

Example of WordPress shortcode for DDR embedding on Sites@Duke.edu sites. — Sites@Duke WordPress sites can embed DDR media by using shortcode with the DDR item’s permalink.

And More!

Here are the other AV-related features we have been able to develop in 2017:

Access control: master files & derivatives alike can be protected so access is limited to only authorized users/groups
Video thumbnail images: model, manage, and display
Video poster frames: model, manage, and display
Intermediate/mezzanine files: model and manage
Rights display: display icons and info from RightsStatements.org and Creative Commons, so it’s clear what users are permitted to do with media.

What’s Next

We look forward to sharing our recent AV development with our peers at the upcoming Samvera Connect conference (Nov 6-9, 2017 in Evanston, IL). Here’s our poster summarizing the work to date:

Poster presentation screenshot for Samvera Connect 2017 — Poster about Duke’s AV development for Samvera Connect conference, Nov 6-9, 2017 (Evanston, IL)

Looking ahead to the next couple months, we aim to round out the year by completing a few more AV-related features, most notably:

Export WebVTT captions as PDF or .txt
Advance the player via linked timecodes in the description field in an item’s metadata
Improve workflows for uploading caption files and transcript documents

Now that these features are in place, we’ll be sharing a bunch of great new AV collections soon!

Behind the Scenes, Duke Digital Repository, Technology, User Experience

Nested Folders of Files in the Duke Digital Repository

August 4, 2017 Cory Lown

Born-digital archival materials present unique challenges to representation, access, and discovery in the DDR. A hard drive arrives at the archives and we want to preserve and provide access to the files. In addition to the content of the files, it’s often important to preserve, to some degree, the organization of the material on the hard drive in nested directories.

One challenge to representing complex inter-object relationships in the repository is the repository’s relatively simple object model. A collection contains one or more items. An item contains one or more components. And a component has one or more data streams. There’s no accommodation in this model for complex groups and hierarchies of items. We tend to talk about this as a limitation, but it also makes it possible to provide search and discovery of a wide range of kinds and arrangements of materials in a single repository and forces us to make decisions about how to model collections in sustainable and consistent ways. But we still need to preserve and provide access to the original structure of the material.

One approach is to ingest the disk image or a zip archive of the directories and files and store the content as a single file in the repository. This approach is straightforward, but makes it impossible to search for individual files in the repository or to understand much about the content without first downloading and unarchiving it.

As a first pass at solving this problem of how to preserve and represent files in nested directories in the DDR we’ve taken a two-pronged approach. We will use a simple approach to modeling disk image and directory content in the repository. Every file is modeled in the repository as an item with a single component that contains the data stream of the file. This provides convenient discovery and access to each individual file from the collection in the DDR, but does not represent any folder hierarchies. The files are just a flat list of objects contained by a collection.

To preserve and store information about the structure of the files, we add an XML METS structMap as metadata on the collection. In addition, each item carries a metadata field that records the complete original file path of the file.

Below is a small sample of the kind of structural metadata that encodes the nested folder information on the collection. It encodes the structure and nesting, directory names (in the LABEL attribute), the order of files and directories, as well as the identifiers for each of the files/items in the collection.

<?xml version="1.0"?>
<mets xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <metsHdr>
    <agent ROLE="CREATOR">
      <name>REPOSITORY DEFAULT</name>
    </agent>
  </metsHdr>
  <structMap TYPE="default">
    <div LABEL="2017-0040" ORDER="1" TYPE="Directory">
      <div ORDER="1">
        <mptr LOCTYPE="ARK" xlink:href="ark:/99999/fk42j6qc37"/>
      </div>
      <div LABEL="RL11405-LFF-0001_Programs" ORDER="2" TYPE="Directory">
        <div ORDER="1">
          <mptr LOCTYPE="ARK" xlink:href="ark:/99999/fk4j67r45s"/>
        </div>
        <div ORDER="2">
          <mptr LOCTYPE="ARK" xlink:href="ark:/99999/fk4d50x529"/>
        </div>
        <div ORDER="3">
          <mptr LOCTYPE="ARK" xlink:href="ark:/99999/fk4086jd3r"/>
        </div>
      </div>
      <div LABEL="RL11405-LFF-0002_H1_Early-Records-of-Decentralization-Conference" ORDER="3" TYPE="Directory">
        <div ORDER="1">
          <mptr LOCTYPE="ARK" xlink:href="ark:/99999/fk4697f56f"/>
        </div>
        <div ORDER="2">
          <mptr LOCTYPE="ARK" xlink:href="ark:/99999/fk45h7t22s"/>
        </div>
      </div>
    </div>
  </structMap>
</mets>

Combining the 1:1 (item:component) object model with structural metadata that preserves the original directory structure of the files on the file system enables us to display a user interface that reflects the original structure of the content even though the structure of the items in the repository is flat.
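As a rough illustration of how the flat items and the structMap fit together, the following Python sketch walks a structMap like the sample and pairs each file identifier with its directory path. The function name `list_files` is an assumption; this is not the repository's own code.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
HREF = "{http://www.w3.org/1999/xlink}href"


def _order(div):
    # Divs in the sample carry an ORDER attribute; default to 0 if absent.
    return int(div.get("ORDER", "0"))


def list_files(mets_xml):
    """Return (directory_path, identifier) pairs in structMap order."""
    root = ET.fromstring(mets_xml)
    results = []

    def walk(div, path):
        # Directory divs contribute their LABEL to the path; file divs do not.
        if div.get("TYPE") == "Directory":
            path = path + [div.get("LABEL", "")]
        for pointer in div.findall("mets:mptr", NS):
            results.append(("/".join(path), pointer.get(HREF)))
        for child in sorted(div.findall("mets:div", NS), key=_order):
            walk(child, path)

    for top in root.findall("mets:structMap/mets:div", NS):
        walk(top, [])
    return results
```

Applied to the sample structMap, this pairs ark:/99999/fk4j67r45s with the path 2017-0040/RL11405-LFF-0001_Programs, which is the information a tree display needs even though the items themselves are stored flat.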

There’s more to it of course. We had to develop a new ingest process that could take as its starting point a file path and then crawl it and its subdirectories to ingest files and construct the necessary structural metadata.
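The crawl itself might be sketched as follows in Python. This is only an outline under stated assumptions: `build_struct_map` and the `mint_id` callback are hypothetical names, and the real ingest process does considerably more (creating items and components, storing the original path on each item, and so on).

```python
import os
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"

# Use the same prefixes as the sample when the tree is serialized.
ET.register_namespace("", METS)
ET.register_namespace("xlink", XLINK)


def build_struct_map(root_path, mint_id):
    """Crawl root_path and return a METS element describing its nesting.

    mint_id is a hypothetical callback that ingests one file and returns
    the identifier assigned to it (for example an ARK).
    """
    mets = ET.Element("{%s}mets" % METS)
    struct_map = ET.SubElement(mets, "{%s}structMap" % METS, TYPE="default")

    def add_directory(parent, path, order):
        div = ET.SubElement(
            parent, "{%s}div" % METS,
            LABEL=os.path.basename(path), ORDER=str(order), TYPE="Directory",
        )
        entries = sorted(os.scandir(path), key=lambda e: e.name)
        files = [e for e in entries if e.is_file()]
        dirs = [e for e in entries if e.is_dir()]
        position = 0
        # Files first, then subdirectories, numbered in one sequence.
        for entry in files:
            position += 1
            file_div = ET.SubElement(div, "{%s}div" % METS, ORDER=str(position))
            ET.SubElement(
                file_div, "{%s}mptr" % METS,
                {"LOCTYPE": "ARK", "{%s}href" % XLINK: mint_id(entry.path)},
            )
        for entry in dirs:
            position += 1
            add_directory(div, entry.path, position)

    add_directory(struct_map, os.path.abspath(root_path), 1)
    return mets
```

Calling `ET.tostring(build_struct_map(path, mint_id), encoding="unicode")` produces XML in the same shape as the sample. The files-before-directories ordering here is an arbitrary choice for the sketch.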

On the UI end of things, a nifty JavaScript plugin called jsTree powers the interactive directory structure display on the collection page.

Because some of the collections are very large and loading a directory tree structure of 100,000 or more items would be very slow, we implemented a small web service in the application that loads the jsTree data only when someone clicks to open a directory in the interface.
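A load-on-demand service of this kind can be quite small. The Python sketch below is an assumption-laden illustration, not our Rails code: the `INDEX` dictionary and `children_json` function are hypothetical stand-ins for a lookup against the repository. It returns only the immediate children of one directory, in the JSON node shape jsTree accepts, where `"children": true` marks a node whose contents are fetched when it is opened.

```python
import json

# Hypothetical in-memory index: directory path -> names of its immediate
# subdirectories and files. A real service would query the repository instead.
INDEX = {
    "2017-0040": {"dirs": ["Programs"], "files": ["readme.txt"]},
    "2017-0040/Programs": {"dirs": [], "files": ["a.pdf", "b.pdf"]},
}


def children_json(index, directory):
    """Return jsTree-style JSON for the immediate children of one directory."""
    entry = index[directory]
    nodes = []
    for name in entry["dirs"]:
        # children=True asks jsTree to request this node's contents on open.
        nodes.append({"id": directory + "/" + name, "text": name, "children": True})
    for name in entry["files"]:
        nodes.append({"id": directory + "/" + name, "text": name, "children": False})
    return json.dumps(nodes)
```

Because each response covers a single directory level, the cost of opening a folder depends on that folder's size rather than on the size of the whole collection.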

The file paths are also keyword searchable from within the public interface. So if a file has the path “kitchen/fruits/bananas/this-banana.txt” you would be able to find the file this-banana.txt by searching for “kitchen” or “fruit” or “banana.”
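To show why a path can match on words like “fruit,” here is a toy Python sketch of path tokenization. It is not how the repository's search index works; the function names are hypothetical, and the plural-stripping rule is a crude stand-in for whatever stemming a real search engine applies.

```python
import re


def path_keywords(path):
    """Split a file path into lowercase search keywords."""
    tokens = set()
    for word in re.split(r"[^A-Za-z0-9]+", path.lower()):
        if not word:
            continue
        tokens.add(word)
        # Crude stand-in for a stemmer: also index a singular form.
        if len(word) > 4 and word.endswith("s"):
            tokens.add(word[:-1])
    return tokens


def matches(query, path):
    """True if the single-word query is one of the path's keywords."""
    return query.lower() in path_keywords(path)
```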

This new functionality to ingest, preserve, and represent files in nested folder structures will be included in the September release of the Duke Digital Repository.

Behind the Scenes, Duke Digital Repository, Technology, User Experience

Turning on the Rights in the Duke Digital Repository

June 30, 2017 Sean Aery

As 2017 reaches its halfway point, we have concluded another busy quarter of development on the Duke Digital Repository (DDR). We have several new features to share, and one we’re particularly delighted to introduce is Rights display.

Back in March, my colleague Maggie Dickson shared our plans for rights management in the DDR, a strategy built upon using rights status URIs from RightsStatements.org, and in a similar fashion, licenses from Creative Commons. In some cases, we supplement the status with free text in a local Rights Note property. Our implementation goals here were two-fold: 1) use standard statuses that are machine-readable; 2) display them in an easily understood manner to users.

New rights display feature in action on a digital object.

What to Display

Getting and assigning machine-readable URIs for Rights is a significant milestone in its own right. Using that value to power a display that makes sense to users is the next logical step. So, how do we make it clear to a user what they can or can’t do with a resource they have discovered? While we could simply display the URI and link to its webpage (e.g., http://rightsstatements.org/vocab/InC-EDU/1.0/ ) the key info still remains a click away. Alternatively, we could display the rights statement or license title with the link, but some of them aren’t exactly intuitive or easy on the eyes. “Attribution-NonCommercial-NoDerivatives 4.0 International,” anyone?

Inspiration

Looking around to see how other cultural heritage institutions have solved this problem led us to very few examples. RightsStatements.org is still fairly new and it takes time for good design patterns to emerge. However, Europeana — co-champion of the RightsStatements.org initiative along with DPLA — has a stellar collections site, and, as it turns out, a wonderfully effective design for displaying rights statuses to users. Our solution ended up very much inspired by theirs; hats off to the Europeana team.

Image from Europeana site. — Europeana Collections UI.

Icons

Both Creative Commons and RightsStatements.org provide downloadable icons on their sites. We opted to store a local copy of the circular SVG versions for both to render in our UI. They’re easily styled, they don’t take up a lot of space, and used together, they have some nice visual unity.

Labels & Titles

We have a lightweight Rails app with an easy-to-use administrative UI for managing auxiliary content for the DDR, so that made a good home for our rights statuses and associated text. Statements are modeled to have a URI and Title, but can also have three additional optional fields: short title, re-use text, and an array of icon classes.
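The shape of that model can be sketched as a small Python data class. The field names mirror the description above, but the class itself, the `display_label` helper, and its fall-back behavior are assumptions for illustration rather than a copy of our Rails model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RightsStatement:
    uri: str                            # required: the machine-readable URI
    title: str                          # required: the full official title
    short_title: Optional[str] = None   # optional: friendlier display label
    reuse_text: Optional[str] = None    # optional: plain-language re-use note
    icon_classes: List[str] = field(default_factory=list)  # optional icons

    def display_label(self):
        # Hypothetical helper: prefer the short title when one is set.
        return self.short_title or self.title
```

For example, a statement titled “Attribution-NonCommercial-NoDerivatives 4.0 International” could be given a shorter display label without changing the URI that is stored on the object.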

Editing rights info associated with each statement.

Displaying the Info

We wanted to be sure to show the rights status in the flow of the rest of an object’s metadata. We also wanted to emphasize this information for anyone looking to download a digital object. So we decided to render the rights status prominently in the download menu, too.

Rights status in download menu — Rights status displays in the download menu.

Rights status also displays alongside other metadata.

What’s Next

Our focus in this area now shifts toward applying these newly available rights statuses to our existing digital objects in the repository, while ensuring that new ingests/deposits get assessed and assigned appropriate values. We’ll also have opportunities to refine where and how the statuses get displayed. We stand to learn a lot from our peer organizations implementing their own rights management strategies, and from our visitors as they use this new feature on our site. There’s a lot of work ahead, but we’re thrilled to have reached this noteworthy milestone.

Behind the Scenes, Duke Digital Repository, User Experience

Rights Management and the Duke Digital Repository

March 10, 2017 Maggie Dickson

Last spring, we were awfully excited to see the DPLA/Europeana release of RightsStatements.org, a suite of standardized rights statements for describing the copyright and re-use status of digital resources. We have never had a comprehensive approach towards rights management for the Duke Digital Repository, but with the release of RightsStatements.org, we now feel we are equipped to wrestle that beast.

Managing and communicating rights statuses for digital collections has long been a challenge for us. The DDR currently allows for the application and display of Creative Commons licenses, which can be used for situations where the copyright holders themselves can assert the rights statuses for their own resources. RightsStatements.org fills a giant gap for us, in that it allows us to assign machine-readable rights to repository resources for which we know something about the rights status but do not hold the copyrights. Additionally, these statements accommodate the often fluid and ambiguous nature of copyrights for cultural heritage materials.

So, it’s been nearly a year since the statements were published, and during that time a community best practice has started to develop. The approach we have decided on for rights management in the Duke Digital Repository follows this emerging best practice, and involves using one field – Dublin Core Rights, as that is the metadata standard our repository uses – to store either a Creative Commons or RightsStatements.org URI, and nothing but that URI, and another field – a local property which we are calling ‘Rights Note’ – to store free text contextual information relating to the rights status of the resource (as long as it’s not in conflict with the rights statement applied). Having machine-processable rights statuses means we will have a much better rights management strategy (we don’t currently have a way to report on the rights status of repository materials), as well as the ability to clearly communicate to users what they can and cannot do with resources they find.
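The “nothing but that URI” rule lends itself to a simple check. The Python sketch below is one possible validation under stated assumptions: the function name `is_rights_uri` is hypothetical, and the accepted path prefixes reflect the public URI patterns of the two services rather than any rule built into our repository.

```python
from urllib.parse import urlparse


def is_rights_uri(value):
    """True if value is only a RightsStatements.org or Creative Commons URI."""
    # Any whitespace means free text has crept into the field.
    if not value or any(ch.isspace() for ch in value):
        return False
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.netloc.lower()
    if host == "rightsstatements.org":
        return parsed.path.startswith("/vocab/")
    if host == "creativecommons.org":
        return parsed.path.startswith(("/licenses/", "/publicdomain/"))
    return False
```

Anything that fails a check like this would belong in the Rights Note field instead.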

Now that we’ve got a strategy for doing rights management, however, we need to develop a strategy for implementing it. We’ll tackle the low-hanging fruit first – collections that have a single, identifiable creator or for which the date ranges put them into the public domain – and then move on to the trickier stuff – for example, collections representing multiple or unidentified creators. Digital collections of archival materials present especially difficult challenges, as the repository ‘itemness’ is frequently at the folder level, meaning that the ‘item’, in these cases, might contain works by multiple creators of varying rights statuses (think of a folder of correspondence, for example).

The good news is, there are a lot of smart people working on addressing these challenges. Laura Capell and Elliott Williams of the University of Miami published a helpful poster, Assigning Rights Statements to Legacy Digital Collections, describing the decision matrix they developed to help them apply rights statements to their digital collections, and as I was writing this blog post, the Society of American Archivists circulated their Guide to Implementing Rights Statements from RightsStatements.org (nice timing, SAA!). I’m hoping to find some good nuggets of wisdom in its pages. We feel especially well-positioned to tackle rights management here at Duke, as Dave Hansen, who was deeply involved in the development of RightsStatements.org, joined us as our Director of Copyright and Scholarly Communications last year. We’d love to hear from other organizations as they develop their own local implementations – we know we’re not in this alone!

Technology, Uncategorized, User Experience

508 Update, Update

February 10, 2017 Michael Daul

A little more than a year ago, I wrote about the proposed update to the 508 accessibility standards. And about three weeks ago, the US Access Board published the final rule that contains updates to the 508 accessibility requirements for Information and Communication Technology (ICT). The rules had not previously been updated since 2001 and as such had greatly lagged behind modern web conventions.

It’s important to note that the 508 guidelines are intended to serve as a vehicle for guiding procurement, while at the same time applying to content created by a given group/agency. As such, the language isn’t always straightforward.

What’s new?

As I outlined in my previous post, a major purpose of the new rule is to move away from regulating types of devices and instead focus on functionality:

… one of the primary purposes of the final rule is to replace the current product-based approach with requirements based on functionality, and, thereby, ensure that accessibility for people with disabilities keeps pace with advances in ICT.

To that effect, one of the biggest changes over the old standard is the adoption of WCAG 2.0 as the compliance level. The fundamental premise of WCAG compliance is that content is ‘perceivable, operable, and understandable’. The bottom line is that, as developers, we should strive to make sure all of our content is usable for everyone across all devices. The adoption of WCAG allows the board to offload responsibility of making incremental changes as technology advances (so we don’t have to wait another 15 years for updates) and also aligns our standards in the United States with those used around the world.

Harmonization with international standards and guidelines creates a larger marketplace for accessibility solutions, thereby attracting more offerings and increasing the likelihood of commercial availability of accessible ICT options.

Another change has to do with making a wider variety of electronic content accessible, including internal documents. It will be interesting to see to what degree this part of the rule is followed by non-federal agencies.

The Revised 508 Standards specify that all types of public-facing content, as well as nine categories of non-public-facing content that communicate agency official business, have to be accessible, with “content” encompassing all forms of electronic information and data. The existing standards require Federal agencies to make electronic information and data accessible, but do not delineate clearly the scope of covered information and data. As a result, document accessibility has been inconsistent across Federal agencies. By focusing on public-facing content and certain types of agency official communications that are not public facing, the revised requirements bring needed clarity to the scope of electronic content covered by the 508 Standards and, thereby, help Federal agencies make electronic content accessible more consistently.

The new rules do not go into effect until January 2018. There’s also a ‘safe harbor’ clause that protects content that was created before this enforcement date, assuming it was in compliance with the old rules. However, if you update that content after January, you’ll need to make sure it complies with the new final rule.

Existing ICT, including content, that meets the original 508 Standards does not have to be upgraded to meet the refreshed standards unless it is altered. This “safe harbor” clause (E202.2) applies to any component or portion of ICT that complies with the existing 508 Standards and is not altered. Any component or portion of existing, compliant ICT that is altered after the compliance date (January 18, 2018) must conform to the updated 508 Standards.

So long story short, a year from now you should make sure all the content you’re creating meets the new compliance level.

Projects, Technology, User Experience

A Refreshing New Look for Our Library Website

February 3, 2017 Sean Aery

If you’ve visited the Duke University Libraries website in the past month, you may have noticed that it looks a bit more polished than it used to. Over the course of the fall 2016 semester, my talented colleague Michael Daul and I co-led a project to develop and implement a new theme for the site. We flipped the switch to launch the theme on January 6, 2017, the week before spring classes began. In this post, I’ll share some background on the project and its process, and highlight some noteworthy features of the new theme we put in place.

Newly refreshed Duke University Libraries website homepage.

Goals

We kicked off the project in Aug 2016 using the title “Website Refresh” (hat-tip to our friends at NC State Libraries for coining that term). The best way to frame it was not as a “redesign,” but more like a 50,000-mile maintenance tuneup for the site. We had four main goals:

Extend the Life of our current site (in Drupal 7) without a major redesign or redevelopment effort
Refresh the Look of the site to be modern but not drastically different
Better Code by streamlining HTML markup & CSS style code for easier management & flexibility
Enhance Accessibility via improved compliance with WCAG accessibility guidelines

Our site is fairly large and complex (1,200+ pages, for starters). So to keep the scope lean, we included no changes in content, information architecture, or platform (i.e., stayed on Drupal 7). We also worked with a lean stakeholder team to make decisions related to aesthetics.

Extending the Life of the Site

Our old website theme was aging; the project leading to its development began five years ago in Sep 2012, was announced in Jan 2013, and then eventually launched about three years ago in Jan 2014. Five years, or even three, is a long time in web years. Sites accumulate a lot of code cruft over time, and the tools for managing and writing code become deprecated quickly. We wanted to invest a little time now to replace some pieces of the site’s front-end architecture with newer and better ones, in order to buy us more time before we’d have to do an expensive full-scale overhaul from the ground up.

Refreshing the Look

Our 2014 site derived a lot of its aesthetic from the main Duke.edu website at the time. Duke’s site has changed significantly since then, and meanwhile, web design trends have changed dramatically: flat design is in, skeuomorphism out. Google Web Fonts are in, Times, Arial, Verdana and company are out. Even a three-year-old site on the web can look quite dated.

New “refreshed” theme, with flatter, more modern aesthetic

Closeup on skeuomorphic embellishments vs. flat elements.

Better Code

Beyond evolving aesthetics, the various behind-the-scenes web frameworks and code workflows are in constant, rapid flux; it can really keep a developer’s head on a swivel. Better code means easier maintenance, and to that end our code got a lot better after implementing these solutions:

Bootstrap Upgrade. For our site’s HTML/CSS/JS framework, we moved from Bootstrap version 2 (2.3.1) to version 3 (3.3.7). This took weeks of work: it meant thousands of pages of markup revisions, only some of which could be done with a global Search & Replace.
Sass for CSS. We trashed all of our old theme’s CSS files and started over using Sass, a far more efficient way to express and maintain style rules than vanilla CSS.
Gulp for Automation. Our new theme uses Gulp to automate code tasks like processing Sass into CSS, auto-prefixing style declarations to work on older browsers, and crunching 30+ CSS files down into one.
Font Awesome. We ditched most of our older image-based icons in favor of Font Awesome ones, which are far easier to reference and style, and faster to load.
Radix. This was an incredibly useful base theme for Drupal that encapsulates/integrates Sass, Gulp, Bootstrap, and FontAwesome. It also helped us get a Bootswatch starter theme in the mix to minimize the local styling we had to do on top of Bootstrap.

We named our new theme Dulcet and put it up on GitHub.

Sass for style management, e.g., expressing colors as reusable variables.

Gulp for task automation, e.g., auto-prefixing styles to account for older browser workarounds.

Accessibility

Some of the code and typography revisions we’ve made in the “refresh” improve our site’s compliance with WCAG 2.0 accessibility guidelines. We’re actively working on further assessment and development in this area. Our new theme is better suited to integrate with existing tools, e.g., to automatically add ARIA attributes to interactive page elements.

Feedback or Questions?

We would love to hear from you if you have any feedback on our new site, if you spot any oddities, or if you’re considering doing a similar project and have any questions. We encourage you to explore the site, and hope you find it a refreshing experience.

Projects, Technology, User Experience

Typography (and the Web)

July 8, 2016 Michael Daul

This summer I’ve been working, or at least thinking about working, on a couple of website design refresh projects. And along those lines, I’ve been thinking a lot about typography. I think it’s fair to say that the overwhelming majority of content that is consumed across the Web is text-based (despite the ever-increasing rise of infographics and multimedia). As such, typography should be considered one of the most important design elements that users will experience when interacting with a website.

CIT Site — An early mockup of the soon-to-be-released CIT design refresh

Early on, Web designers were restricted to using certain ‘stacks’ of web-safe fonts: the browser would hunt through the list until it found a font available on the user’s computer. Or worst-case, the page would default to using the most basic system ‘sans’ or ‘serif.’ So type design back then wasn’t very flexible and could certainly not be relied upon to render consistently across browsers or platforms, which essentially resulted in most website text looking more or less the same. In 2004, some very smart people released sIFR, which was a Flash-based font replacement technique. It ushered in a bit of a typography renaissance and allowed designers to include almost any typeface they desired into their work with the confidence that the overwhelming majority of users would see the same thing, thanks largely to the prevalence of the (now maligned) Flash plugin.

Right before Steve Jobs fired the initial shot that would ultimately lead to the demise of Flash, an additional font replacement technique, named Cufon, was released to the world. This approach used Scalable Vector Graphics and JavaScript (instead of Flash) and was almost universally compatible across browsers. Designers and developers were now very happy as they could use non-standard typefaces in their work without relying on Flash.

More or less in parallel with the release of Cufon came the widespread adoption across browsers of the @font-face rule. This allowed developers to load fonts from a web server and have them render on a page, instead of relying on the local fonts a user had installed. In mid to late 2009, services like Typekit, League of Moveable Type, and Font Squirrel began to appear. Instead of selling licenses to fonts outright, Typekit worked on a subscription model and made various sets of fonts available for use both locally with design programs and for web publishing, depending on your membership type. [Adobe purchased Typekit in late 2011 and includes access to the service via their Creative Cloud platform.] LoMT and Font Squirrel curate freeware fonts and make it easy to download the appropriate files and CSS code to integrate them into your site. Google released their font service in 2010 and it continues to get better and better. They launched an updated version a few weeks ago along with a promo video.

There are also many type foundries that make their work available for use on the web. A few of my favorite font retailers are FontShop, Emigre, and Monotype. The fonts available from these ‘premium’ shops typically involve a higher degree of sophistication, more variations of weight, and extra attention to detail — especially with regard to things like kerning, hinting, and ligatures. There are also many interesting features available in OpenType (a more modern file format for fonts) and they can be especially useful for adding diversity to the look of brush/script fonts. The premium typefaces usually incorporate them, whereas free fonts may not.

Modern web conventions are still struggling with some aspects of typography, especially when it comes to responsive design. There are many great arguments about which units we should be using (viewport, rem/em, px) and how they should be applied. There are calculators and libraries for adjusting things like size, line length, ratios, and so on. There are techniques to improve kerning. But I think we have yet to find a standard, all-in-one solution — there always seems to be something new and interesting available to explore, which pretty much underscores the state of Web development in general.

Here are some other excellent resources to check out:

I’ll conclude with one last recommendation — the Introduction to Typography class on Coursera. I took it for fun a few months ago. It seemed to me that the course is aimed at those who may not have much of a design background, so it’s easily digestible. The videos are informative, not overly complex, and concise. The projects were fun to work on and you end up getting to provide feedback on the work of your fellow classmates, which I think is always fun. If you have an hour or two available for four weeks in a row, check it out!

Digital Collections, Technology, User Experience

Web Interfaces for our Audiovisual Collections

June 10, 2016 Sean Aery 1 Comment

Audiovisual materials account for a significant portion of Duke’s Digital Collections. All told, we now have over 3,400 hours of A/V content accessible online, spread over 14,000 audio and video files discoverable in various platforms. We’ve made several strides in recent years introducing impactful collections of recordings like H. Lee Waters Films, the Jazz Loft Project Records, and Behind the Veil: Documenting African American Life in the Jim Crow South. This spring, the Duke Chapel Recordings collection (including over 1,400 recordings) became our first A/V collection developed in the emerging Duke Digital Repository platform. Completing this first phase of the collection required some initial development for A/V interfaces, and it’ll keep us on our toes to do more as the project progresses through 2019.

A video recording in the Duke Chapel Recordings collection. — A video interface in the Duke Chapel Recordings collection.

Preparing A/V for Access Online

When digitizing audio or video, our diligent Digital Production Center staff create a master file for digital preservation, and from that, a single derivative copy that’s smaller and appropriately compressed for public consumption on the web. The derivative files we create are compressed enough that they can be reliably pseudo-streamed (a.k.a. “progressive download”) to a user over HTTP in chunks (“byte ranges”) as they watch or listen. We are not currently using a streaming media server.
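As a rough illustration of what “byte ranges” means here, the sketch below splits a file of a given size into the Range header values a client might send over HTTP. The helper name `byte_range_headers` and the 1 MB chunk size are assumptions made for this example; in practice the browser and player decide which ranges to request.

```ruby
# Hypothetical helper: list the HTTP Range header values that would cover
# a file of `file_size` bytes in chunks of at most `chunk_size` bytes.
def byte_range_headers(file_size, chunk_size = 1_000_000)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size.positive?

  (0...file_size).step(chunk_size).map do |first|
    last = [first + chunk_size, file_size].min - 1 # byte ranges are inclusive
    "bytes=#{first}-#{last}"
  end
end

# A 2.5 MB file is covered by three ranges; the last one is shorter.
puts byte_range_headers(2_500_000)
```

Each value would be sent as the Range header of a separate request, and a server that supports ranges answers with just that slice of the file.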

Here’s what’s typical for these files:

Audio. MP3 format, 128kbps bitrate. ~1MB/minute.
Video. MPEG4 (.mp4) wrapper files. ~17MB/minute or 1GB/hour.
The video track is encoded as H.264 at about 2,300 kbps; 640×480 for standard 4:3.
The audio track is AAC-encoded at 160kbps.

These specs are also consistent with what we request of external vendors in cases where we outsource digitization.
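The per-minute sizes follow from the bitrates. Here is a small worked calculation, assuming 1 MB means 1,000,000 bytes and ignoring container overhead; the method name `megabytes_per_minute` is made up for this sketch.

```ruby
# Convert a bitrate in kilobits per second to megabytes per minute
# (1 MB taken as 1,000,000 bytes; container overhead is ignored).
def megabytes_per_minute(kbps)
  kbps * 1000 / 8.0 * 60 / 1_000_000
end

audio = megabytes_per_minute(128)        # MP3 audio
video = megabytes_per_minute(2300 + 160) # H.264 video track plus AAC audio track

puts format("audio: %.2f MB/minute", audio)
puts format("video: %.2f MB/minute, %.2f GB/hour", video, video * 60 / 1000)
```

This works out to just under 1 MB per minute for audio and roughly 18 MB per minute (a little over 1 GB per hour) for video, which is in line with the approximate figures listed above.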

The A/V Player Interface: JWPlayer

Since 2014, we have used a local instance of JWPlayer as our A/V player of choice for digital collections. JWPlayer bills itself as “The Most Popular Video Player & Platform on the Web.” It plays media directly in the browser by using standard HTML5 video specifications (supported for most intents & purposes now by all modern browsers).

We like JWPlayer because it’s well documented and easy to customize, with a robust JavaScript API to hook into. Its developers do a nice job tracking browser support for all HTML5 video features, and they design their software with smart fallbacks to look and function consistently no matter what combo of browser & OS a user might have.

In the Duke Digital Repository and our archival finding aids, we’re now using the latest version of JWPlayer. It’s got a modern, flat aesthetic and is styled to match our color palette.

JW Player displaying inline video for the Jazz Loft Project Records collection guide.

Playlists

Here’s an area where we extended the new JWPlayer with some local development to enhance the UI. When we have a playlist—that is, a recording that is made up of more than one MP3 or MP4 file—we wanted a clearer way for users to navigate between the files than what comes out of the box. It was fairly easy to create some navigational links under the player that indicate how many files are in the playlist and which is currently playing.

A multi-part audio item from Duke Chapel Recordings.

Captions & Transcripts

Work is now underway (by three students in the Duke Divinity School) to create timed transcripts of all the sermons given within the recorded services included in the Duke Chapel Recordings project.

We contracted through Popup Archive for computer-generated transcripts as a starting point. Those are about 80% accurate, but Popup provides a really nice interface for editing and refining the automated text before exporting it to its ultimate destination.

One of the most interesting aspects of HTML5 <video> is the <track> element, wherein you can associate as many files of captions, subtitles, descriptions, or chapter information as needed. Track files are encoded as WebVTT, so we’ll use WebVTT files for the transcripts once they are complete. We’ll also likely capture the start of a sermon within a recording as a WebVTT chapter marker to provide easier navigation to the part of the recording that’s the most likely point of interest.

JWPlayer displays WebVTT captions (and chapter markers, too!). The captions will be wonderful for accessibility (especially for people with hearing disabilities); they can be toggled on/off within the media player window. We’ll also be able to use the captions to display an interactive searchable transcript on the page near the player (see this example using Javascript to parse the WebVTT). Our friends at NCSU Libraries have also shared some great work parsing WebVTT (using Ruby) for interactive transcripts.
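To show why WebVTT lends itself to a searchable transcript, here is a minimal Ruby sketch that parses cues into start time, end time, and text, then searches them. It is not the NCSU code or anything running in our repository: the names `Cue`, `to_seconds`, and `parse_webvtt` and the sample caption text are invented for the example, and the parser skips cue settings, NOTE and STYLE blocks, and other parts of the spec.

```ruby
# Minimal WebVTT cue parser (a sketch; assumes well-formed timestamps).
Cue = Struct.new(:start_seconds, :end_seconds, :text)

TIMESTAMP = /\A(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\z/

# Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds.
def to_seconds(timestamp)
  hours, minutes, seconds, millis = timestamp.match(TIMESTAMP).captures
  hours.to_i * 3600 + minutes.to_i * 60 + seconds.to_i + millis.to_i / 1000.0
end

# Split the file into blank-line-separated blocks and keep those with a timing line.
def parse_webvtt(source)
  blocks = source.split(/(?:\r?\n){2,}/)
  cues = blocks.map do |block|
    lines = block.strip.lines.map(&:chomp)
    timing_index = lines.index { |line| line.include?("-->") }
    next unless timing_index # header or other non-cue block

    start_time, rest = lines[timing_index].split("-->").map(&:strip)
    end_time = rest.split.first # drop any cue settings after the end time
    Cue.new(to_seconds(start_time), to_seconds(end_time),
            lines.drop(timing_index + 1).join(" "))
  end
  cues.compact
end

# Invented sample captions, for illustration only.
sample = <<~VTT
  WEBVTT

  1
  00:00:01.000 --> 00:00:04.500
  Good morning and welcome.

  00:01:05.250 --> 00:01:09.000 align:start
  Our reading today
  comes in two parts.
VTT

cues = parse_webvtt(sample)
matches = cues.select { |cue| cue.text.downcase.include?("reading") }
matches.each { |cue| puts "#{cue.start_seconds}s: #{cue.text}" }
```

An interactive transcript would pair each matching cue’s start time with a call that seeks the player to that point.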

The Future

We have a few years until the completion of the Duke Chapel Recordings project. Along the way, we expect to:

add closed captions to the A/V
create an interactive transcript viewer from the captions
work those captions back into the index to aid discovery
add a still-image extract from each video to use as a thumbnail and “poster frame” image
offer up much more A/V content in the Duke Digital Repository

Stay tuned!

Behind the Scenes, Digital Collections, Technology, User Experience

EDTF-Humanize

April 22, 2016 Cory Lown

You might think that three posts about the Extended Date Time Format (EDTF) are three too many for one blog, but we who work with digital collections are very enthusiastic about dates. In two previous posts (Enjoy your Metadata: Fun with Date Encoding and It’s Date Night Here at Digital Projects and Production Services), Maggie and I (separately) wrote about our date metadata for digital collections and our reasons for migrating to EDTF.

To recap, EDTF is a machine readable date encoding standard that enables us to record dates with various levels of precision and certainty — important for cultural heritage collections.

One challenge of working with EDTF formatted dates is that it’s not necessarily obvious to humans what they mean. To the best of my knowledge, the available tools for working with EDTF dates are intended for parsing EDTF strings into objects that are understandable to programming languages. This is great for working with EDTF dates if you want to have the software do things like sort a list of items into date order or provide a searchable index of years. These tools are less helpful for outputting human readable versions of EDTF encoded dates, such as for an item’s metadata record display.

For instance, many of the photographs in the Alex Harris collection are dated with a season and year, such as “summer 1972.” EDTF specifies that “summer 1972” should be encoded as “1972-22.” This is great for our digital collections software, which knows what “1972-22” means (thanks to the EDTF gem). However, people unfamiliar with the EDTF standard will likely not understand what the date means.

In the metadata record for “Cabin Interior, Hickory Nut Gap, North Carolina” you can see that the date is stored as EDTF:

But in the metadata display in the digital repository’s public interface we want to display the date in a more human friendly format:

Because EDTF is machine readable it’s possible to create a set of rules for transforming the dates for display. These rules can get complicated so I wrote a Ruby Gem that masks some of the complexity and adds a humanize method to any EDTF date object. This makes it simple to transform any EDTF encoded date to a human readable string.

> Date.edtf('1972-22').humanize => "summer 1972"

Although more work could be done to make it more flexible, the gem is somewhat configurable. For instance, an approximate date with year precision is encoded in EDTF as “1972~”. The humanize method will by default output this as “circa 1972”:

> Date.edtf('1972~').humanize => "circa 1972"

But if for some reason I wanted a different output for an approximate date with year precision, I could modify the edtf-humanize configurations:

> Edtf::Humanize.configuration.approximate_date_prefix = "approximately " => "approximately "

> Edtf::Humanize.configuration.year_precision_strftime_format = "%y" => "%y"

After changing these settings, when I use the humanize method on the same EDTF date I get a different output:

> Date.edtf('1972~').humanize => "approximately 72"

The humanize method that edtf-humanize adds to EDTF objects makes it much easier to display EDTF encoded date metadata in configurable human friendly formats.
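To give a feel for the kind of rules involved, here is a plain-Ruby sketch that handles only three shapes of EDTF string: a bare year, an approximate year, and a season with a year. It is not the gem’s implementation; `humanize_edtf`, the `SEASONS` table, and the `approximate_prefix` option are names invented for this example, and the season codes (21 to 24) follow the EDTF specification.

```ruby
# A plain-Ruby sketch of humanizing rules; NOT the edtf-humanize gem's code.
# Handles only "YYYY", "YYYY~" (approximate year) and "YYYY-SS" (season) strings.
SEASONS = {
  "21" => "spring", "22" => "summer", "23" => "autumn", "24" => "winter"
}.freeze

def humanize_edtf(edtf, approximate_prefix: "circa ")
  case edtf
  when /\A(\d{4})-(2[1-4])\z/ then "#{SEASONS[$2]} #{$1}"
  when /\A(\d{4})~\z/         then "#{approximate_prefix}#{$1}"
  else edtf # bare years and anything not covered here pass through unchanged
  end
end

puts humanize_edtf("1972-22")
puts humanize_edtf("1972~")
puts humanize_edtf("1972~", approximate_prefix: "approximately ")
```

The real gem works on parsed EDTF date objects rather than raw strings, which is what lets it cover intervals, sets, and other cases that a pattern match like this one ignores.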

The edtf-humanize gem is available on GitHub and RubyGems.org, so it can be included in any Rails project’s Gemfile. It should be considered an early release and could use some enhancement for use cases beyond Duke’s Digital Repository, where it was originally designed to be used.

Digital Collections, Technology, User Experience

Perplexed by Context? Slick Sticky Titles Skip the Toll of the Scroll

February 24, 2016 Sean Aery

We have a few new exciting enhancements within our digital collections and archival collection guide interfaces to share this week, all related to the challenge of presenting the proper archival context for materials represented online. This is an enormous problem we’ve previously outlined on this blog, both in terms of reconciling different descriptions of the same things (in multiple metadata formats/systems) and in terms of providing researchers with a clear indication of how a digitized item’s physical counterpart is arranged and described within its source archival collection.

Here are the new features:

View Item in Context Link

Our new digital collections (the ones in the Duke Digital Repository) have included a prominent link (under header “Source Collection”) from a digitized item to its source archival collection with some snippets of info from the collection guide presented in a popover. This was an important step toward connecting the dots, but still only gets someone to the top of the collection guide; from there, researchers are left on their own for figuring out where in the collection an item resides.

Archival source collection info presented for an item in the W. Duke & Sons collection.

Beginning with this week’s newly-available Alex Harris Photographs Collection (and also the Benjamin & Julia Stockton Rush Papers Collection), we take it another step forward and present a deep link directly to the row in the collection guide that describes the item. For now, this link says “View Item in Context.”

A deep link to View Item in Context for an item in the Alex Harris Photographs Collection

This linkage is powered by indicating an ArchivesSpace ID in a digital object’s administrative metadata; it can be the ID for a series, subseries, folder, or item title, so we’re flexible in how granular the connection is between the digital object and its archival description.
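As a sketch of the idea, a deep link can be built by attaching the stored ID to the collection guide’s URL as a fragment. Everything specific here is hypothetical: the method name `item_in_context_url`, the `aspace_` fragment prefix, the example URL, and the example ID are placeholders, not the repository’s actual scheme.

```ruby
require "uri"

# Hypothetical: point at a row in a collection guide by using the
# ArchivesSpace ID from the item's administrative metadata as a URL fragment.
def item_in_context_url(guide_url, aspace_id)
  uri = URI.parse(guide_url)
  uri.fragment = "aspace_#{aspace_id}"
  uri.to_s
end

puts item_in_context_url("https://example.org/findingaids/examplecollection/", "abc123")
```

Because the ID can belong to a series, subseries, folder, or item, the same function covers every level of granularity.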

Sticky Title & Series Info

Our archival collection guides are currently rendered as single webpages broken into sections. Larger collections make for long webpages. Sometimes they’re really super long. Where the contents of the collection are listed, there’s a visual hierarchy in place with nested descriptions of series, subseries, etc. but it’s still difficult to navigate around and simultaneously understand what it is you’re viewing. The physical tedium of scrolling and the cognitive load required to connect related descriptive information located far away on a page make for bad usability.

As of last week, we now keep the title of the collection “stuck” to the top of the screen once you’re no longer viewing the top of the page (it also functions as a link to get back to the top). And even more helpful is a new sticky series header that links to the beginning of the archival series within which the currently visible items were arranged; there’s usually an important description up there that helps contextualize the items listed below. This sticky header is context-aware, meaning it follows you around like a loyal companion, updating itself perpetually to reflect where you are as you navigate up or down.

Title & series information “stuck” to the top of a collection guide.

This feature is powered via the excellent Bootstrap Scrollspy Javascript utility combined with some custom styling.
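The lookup behind a context-aware header is simple to state: of all the series that start at or above the current scroll position, show the last one. The sketch below expresses that logic in Ruby for illustration only; the real feature runs in the browser via Scrollspy, and the `Series` struct, `current_series` method, titles, and pixel offsets are all made up.

```ruby
# Sketch of the lookup behind a context-aware sticky header: given each
# series' vertical offset on the page, find the series the reader is in.
Series = Struct.new(:title, :offset)

def current_series(series_list, scroll_position)
  series_list.sort_by(&:offset)
             .take_while { |series| series.offset <= scroll_position }
             .last
end

# Invented series titles and offsets (in pixels from the top of the page).
guide = [
  Series.new("Correspondence", 1200),
  Series.new("Photographs", 5400),
  Series.new("Writings", 9800)
]

puts current_series(guide, 6000).title
```

When the reader is still above the first series the method returns nil, which corresponds to showing only the collection title.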

All Series Browser

To give researchers easier browsing between different archival series in a collection, we added a link in the sticky header to browse “All Series.” That link pops down a menu to jump directly to the start of each series within the collection guide.

Direct Links to Anything

Researchers can now easily get a link to any row in a collection guide where the contents are described. This can be anything: a series, subseries, folder, or item. It’s simple: just mouse over the row, click the arrow that appears at the left, and copy the URL from the address bar. The row in the collection guide that’s the target of that link gets highlighted in green.

Click the arrow to link directly to a row within the collection guide.

We would love to get feedback on these features to learn whether they’re helpful and see how we might enhance or adjust them going forward. Try them out and let us know what you think!

Special thanks to our metadata gurus Noah Huffman and Maggie Dickson for their contributions on these features.

Notes from the Duke University Libraries Digital Projects Team