Everyone knows that Twitter limits each post to 140 characters. Early criticism has since cooled and most people agree it’s a helpful constraint, circumvented through clever (some might say better) writing, hyperlinks, and URL shorteners. But as a reader of tweets, how do you know what lies at the other end of a shortened link? What entices you to click? The tweet author can rarely spare the characters to attribute the source site or provide a snippet of content, and can’t be expected to attach a representative image or screenshot.

Our webpages are much more than just mystery destinations for shortened URLs. Twitter agrees: its developers want help understanding what the share-worthy content from a webpage actually is in order to present it in a compelling way alongside the 140 characters or fewer. Enter two library hallmarks: vocabularies and metadata.

This week, we added Twitter Card metadata in the <head> of all of our digital collections pages and in our library blogs. This data instantly made all tweets and retweets linking to our pages far more interesting. Check it out!

For the blogs, tweets now display the featured image, post title, opening snippet, site attribution, and a link to the original post. Links to items from digital collections now show the image itself (along with some item info), while links to collections, categories, or search results now display a grid of four images with a description underneath. See these examples:

 

A gallery tweet, linking to the homepage for the William Gedney Photographs collection.

Summary Card With Large Image: Tweet linking to a post in The Devil’s Tale blog.

Summary Card With Large Image: tweet linking to a digital collections image.

 

Why This Matters

In 2013-14, social media platforms accounted for 10.1% of traffic to our blogs (~28,000 visits, 11,300 of them via Twitter) and 4.3% of visits to our digital collections (~17,000 visits, 1,000 of them via Twitter). That seems low, but perhaps it’s because of the mystery link phenomenon. These new media-rich tweets have the potential to increase our traffic through these channels by being more interesting to look at and more compelling to click. We’re looking forward to finding out whether they do.
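
To put those figures in perspective, here is a small Python sketch that works out Twitter's share of social media visits and the total visits implied by the percentages. It assumes the percentages and the visit counts measure the same thing, and the inputs are the rounded numbers quoted above, so the results are rough estimates only. The variable and function names are ours, chosen for illustration.

```python
# Approximate 2013-14 figures as quoted in the text (rounded values).
channels = {
    "blogs": {
        "social_share": 0.101,
        "social_visits": 28_000,
        "twitter_visits": 11_300,
    },
    "digital collections": {
        "social_share": 0.043,
        "social_visits": 17_000,
        "twitter_visits": 1_000,
    },
}


def twitter_share_of_social(stats):
    """Fraction of social media visits that arrived via Twitter."""
    return stats["twitter_visits"] / stats["social_visits"]


def implied_total_visits(stats):
    """Total visits implied by the social share (an estimate from rounded inputs)."""
    return stats["social_visits"] / stats["social_share"]


for name, stats in channels.items():
    print(
        f"{name}: Twitter is {twitter_share_of_social(stats):.1%} of social visits; "
        f"implied total of about {implied_total_visits(stats):,.0f} visits"
    )
```

By this estimate, Twitter drives roughly 40% of social visits to the blogs but only about 6% of social visits to digital collections.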

And regardless of driving clicks, there are two other benefits of Twitter Cards that we really care about in the library: context and attribution. We love it when our collections and blog posts are shared on Twitter. These tweets now automatically give some additional information and helpfully cite the source.

How to Get Your Own Twitter Cards

The Manual Way

If you’re manually adding tags like we’ve done in our Digital Collections templates, you can “View Source” on any of our pages to see what <meta> tags make the magic happen. Moz also has some useful code snippets to copy, with links to validator tools so you can make sure you’re doing it correctly.

Twitter Card metadata for a Gallery Page (Broadsides & Ephemera Collection)
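
If your templates are generated by code, the tags can be built programmatically. The following is a minimal Python sketch, not our actual template code: it renders a "summary_large_image" card from a dictionary. The function name, account handle, and URLs are hypothetical placeholders; the tag names (twitter:card, twitter:site, twitter:title, twitter:description, twitter:image) are the standard Twitter Card properties.

```python
from html import escape


def twitter_card_tags(card):
    """Render a dict of Twitter Card properties as <meta> tags."""
    lines = []
    for name, value in card.items():
        # escape(..., quote=True) keeps values safe inside attribute quotes
        lines.append(
            '<meta name="twitter:{}" content="{}">'.format(
                escape(name, quote=True), escape(value, quote=True)
            )
        )
    return "\n".join(lines)


# Hypothetical values for illustration only
example_card = {
    "card": "summary_large_image",
    "site": "@example_library",
    "title": "Broadsides & Ephemera",
    "description": "A sample description of the item.",
    "image": "https://example.org/images/item-001.jpg",
}

print(twitter_card_tags(example_card))
```

The output belongs inside the page's <head>. Escaping matters here because titles and descriptions often contain ampersands or quotation marks.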

WordPress

Since our blogs run on WordPress, we were able to use the excellent WordPress SEO plugin by Yoast. It’s helpful for a lot of things related to search engine optimization, and it makes this social media optimization easy, too.

Adding Twitter Card metadata with the WordPress SEO plugin.

Once your tags are in place, you just need to validate an example from your domain using the Twitter Card Validator before Twitter will turn on the media-rich tweets. It doesn’t take long at all: ours began appearing within a couple of hours. The cards apply retroactively to previous tweets, too.
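
Before submitting a page to the validator, it can help to run a quick local check that the tags are actually present. Here is a rough Python sketch using only the standard library. The REQUIRED list is our own assumption for a large-image summary card, not an official specification, and this check does not replace the Twitter Card Validator.

```python
from html.parser import HTMLParser

# Assumed minimum set for this sketch; check Twitter's documentation
# for what each card type actually requires.
REQUIRED = ("twitter:card", "twitter:title", "twitter:description", "twitter:image")


class MetaCollector(HTMLParser):
    """Collect twitter:* <meta> tags found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Accept either name= or property= as the key attribute
        key = attributes.get("name") or attributes.get("property")
        if key and key.startswith("twitter:"):
            self.found[key] = attributes.get("content") or ""


def missing_card_tags(html_text, required=REQUIRED):
    """Return the required tag names that are absent or have empty content."""
    collector = MetaCollector()
    collector.feed(html_text)
    return [name for name in required if not collector.found.get(name)]


# Hypothetical page for illustration
sample = """<html><head>
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Sample post">
</head><body></body></html>"""

print(missing_card_tags(sample))  # ['twitter:description', 'twitter:image']
```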

Related Work

Our addition of Twitter Card data follows similar work we have done using semantic markup in our Digital Collections site using the Open Graph and Schema.org vocabularies. Open Graph is a standard developed by Facebook. Similar to Twitter Card metadata, OG tags inform Facebook what content to highlight from a linked webpage. Schema.org is a vocabulary for describing the contents of web pages in a way that is helpful for retrieval and representation in Google and other search engines.
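
Open Graph tags look much like Twitter Card tags, except that they use a property attribute with an og: prefix. As a minimal Python sketch (the function name and all values are hypothetical, and this is not our production code), the four basic Open Graph properties for a page could be rendered like this:

```python
from html import escape


def open_graph_tags(properties):
    """Render (property, value) pairs as Open Graph <meta> tags."""
    return "\n".join(
        '<meta property="og:{}" content="{}">'.format(
            escape(prop, quote=True), escape(value, quote=True)
        )
        for prop, value in properties
    )


# Hypothetical values for illustration only
example_page = [
    ("title", "Sample Collection Item"),
    ("type", "website"),
    ("url", "https://example.org/collections/item-001/"),
    ("image", "https://example.org/images/item-001.jpg"),
]

print(open_graph_tags(example_page))
```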

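These vocabularies share a common approach. Open Graph is based on RDFa, Schema.org terms can be expressed in RDFa (among other syntaxes), and Twitter Card tags follow the same pattern of name/value pairs in <meta> elements. RDFa is a cornerstone of Linked Data on the web that supports the description of resources using whichever vocabularies you choose. Google, Twitter, Facebook, and other major players in our information ecosystem are now actively using this data, providing clear incentive for web authors to provide it. We should keep striving to play along.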

 

Back in February 2014, we wrapped up the CCC project, a collaborative three-year IMLS-funded digitization initiative with our partners in the Triangle Research Libraries Network (TRLN). The full title of the project is a mouthful, but it captures its essence: “Content, Context, and Capacity: A Collaborative Large-Scale Digitization Project on the Long Civil Rights Movement in North Carolina.”

Together, the four university libraries (Duke, NC State, UNC-Chapel Hill, NC Central) digitized over 360,000 documents from thirty-eight collections of manuscripts relevant to the project theme. About 66,000 were from our David M. Rubenstein Rare Book & Manuscript Library collections.

Large-Scale

So how large is “large-scale”? By comparison, when the project kicked off in summer 2011, we had a grand total of 57,000 digitized objects available online (“published”), collectively accumulated through sixteen years of digitization projects. That number was 69,000 by the time we began publishing CCC manuscripts in June 2012. Putting just as many documents online in three years as we’d been able to do in the previous sixteen naturally requires a much different approach to creating digital collections.

Traditional Digitization                     | Large-Scale Digitization
---------------------------------------------|------------------------------------------------------
Individual items identified during scanning  | No item-level identification: entire folders scanned
Descriptive metadata applied to each item    | Archival description only (e.g., at the folder level)
Robust portals for search & browse           | Finding aid / collection guide as access point

There are some considerable tradeoffs between document availability and features for discovery and access, but going at this scale speeds publication considerably. Large-scale digitization was new for all four partners, so we benefited by working together.

Digitized documents accessed through an archival finding aid / collection guide with folder-level description.

Project Evaluation

CCC staff completed qualitative and quantitative evaluations of this large-scale digitization approach during the course of the project, ranging from conducting user focus groups and surveys to analyzing the impact on materials prep time and image quality control. Researcher assessments targeted three distinct user groups: 1) Faculty & History Scholars; 2) Undergraduate Students (in research courses at UNC & NC State); 3) NC Secondary Educators.

Here are some of the more interesting findings (consult the full reports for details):

  • Ease of Use. Faculty and scholars, for the most part, found it easy to use digitized content presented this way. Undergraduates were more ambivalent, and secondary educators had the most difficulty.
  • To Embed or Not to Embed. In 2012, Duke was the only library presenting the image thumbnails embedded directly within finding aids and a lightbox-style image navigator. Undergrads who used Duke’s interface found it easier to use than UNC or NC Central’s, and Duke’s collections had a higher rate of images viewed per folder than the other partners. UNC & NC Central’s interfaces now use a similar convention.
  • Potential for Use. Most users surveyed said they could indeed imagine themselves using digitized collections presented in this way in the course of their research. However, the approach falls short in meeting key needs for secondary educators’ use of primary sources in their classes.
  • Desired Enhancements. The top two most desired features by faculty/scholars and undergrads alike were 1) the ability to search the text of the documents (OCR), and 2) the ability to explore by topic, date, document type (i.e., things enabled by item-level metadata). PDF download was also a popular pick.

 

Impact on Duke Digitization Projects

Since the moment we began putting our CCC manuscripts online (June 2012), we’ve completed the eight CCC collections using this large-scale strategy, and an additional eight manuscript collections outside of CCC using the same approach. We have now cumulatively put more digital objects online using the large-scale method (96,000) than we have via traditional means (75,000). But in that time, we have also completed eleven digitization projects with traditional item-level identification and description.

We see the large-scale model for digitization as complementary to our existing practices: a technique we can use to meet the publication needs of some projects.

Usage

Do people actually use the collections when presented in this way? Some interesting figures:

  • Views / item in 2013-14 (traditional digital object; item-level description): 13.2
  • Views / item in 2013-14 (digitized image within finding aid; folder-level description): 1.0
  • Views / folder in 2013-14 (digitized folder view in finding aid): 8.5

It’s hard to attribute the usage disparity entirely to the publication method (they’re different collections, for one). But it’s reasonable to deduce (and unsurprising) that bypassing item-level description generally results in less traffic per item.

On the other hand, one of our CCC collections (The Allen Building Takeover Collection) has indeed seen heavy use: so much, in fact, that nearly 90% of TRLN’s CCC items viewed in the final six months of the project were from Duke. Its images averaged over 78 views apiece in the past year; its eighteen folders were opened 363 times apiece. Why? The publication of this collection coincided with an on-campus exhibit, and it was incorporated into multiple courses at Duke for writing assignments based on primary sources.
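
To see how far this collection departs from the norm, here is a quick Python sketch comparing its figures with the finding-aid averages listed earlier. Treat it as a rough illustration: the reporting periods may not line up exactly, "over 78" is a lower bound, and the averages may themselves include this collection. The names are ours, chosen for illustration.

```python
# Averages for digitized content within finding aids, 2013-14 (from the list above)
baseline = {"views_per_item": 1.0, "views_per_folder": 8.5}

# Allen Building Takeover Collection figures quoted in the text
allen_building = {"views_per_item": 78, "views_per_folder": 363}


def usage_multiples(collection, reference):
    """How many times the reference average each metric represents."""
    return {metric: collection[metric] / reference[metric] for metric in reference}


for metric, multiple in usage_multiples(allen_building, baseline).items():
    print(f"{metric}: about {multiple:.1f}x the finding-aid average")
```

By this rough measure, the collection's images drew at least 78 times the average views per item, and its folders were opened about 43 times as often as the average folder.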

The takeaway is that sometimes having interesting, important, and timely content available for use online is more important than the features enabled or the process by which it all gets there.

Looking Ahead

We’ll keep pushing ahead with evolving our practices for putting digitized materials online. We’ve introduced many recent enhancements, like full-text searching, a document viewer, and embedded HTML5 video. Inspired by the CCC project, we’ll continue to enhance our finding aids to provide access to digitized objects inline for context (e.g., The Jazz Loft Project Records). Our TRLN partners have also made excellent upgrades to the interfaces to their CCC collections (e.g., at UNC, at NC State) and we plan, as usual, to learn from them as we go.