
You’re going to lose: The inherent complexity, and near impossibility, of developing for digital collections

 

“Nobody likes you. Everybody hates you. You’re going to lose. Smile, you f*#~.”

Joe Hallenbeck, The Last Boy Scout

While I’m glad not to be living in a Tony Scott movie, on occasion I feel like Bruce Willis’ character near the beginning of “The Last Boy Scout.” Just look at some of the things they say about us.

Current online interfaces to primary source materials do not fully meet the needs of even experienced researchers. (DeRidder and Matheny)

The criticism, it cuts deep. But at least they were trying to be gentle, unlike this author:

[I]n use, more often than not, digital library users and digital libraries are in an adversarial position. (Saracevic, p. 9)

That’s gonna leave a mark. Still, it’s the little shots they take, the sidelong jabs, that hurt the most:

The anxiety over “missing something” was quite common across interviews, and historians often attributed this to the lack of comprehensive search tools for primary sources. (Rumer and Schonfeld, p. 16)

Item types in Tripod2.

I’m fond of saying that the YouTube developers have it easy. They support one content type – and until recently, it was Flash, for Pete’s sake – minimal metadata, and then what? Comments? Links to some other videos? Wow, that’s complicated.

By contrast, we’ve developed for no fewer than fifteen different item types during the life of Tripod2, the platform that we’ve used to provide discovery and access for Duke Digital Collections since March 2011. You want a challenge? Try building an interface for flippable anatomical fugitive sheets. It’s one thing to create a feature allowing users to embed videos from a flat website structure; it’s quite another to allow it from a site loaded with heterogeneous content types, then extend it to include items nested within multiple levels of description in finding aids (for an example, see the “Southwest Georgia Voters Project” item here).

I think the problem set of developing tools for digitized primary sources is one of the most interesting areas in the field of librarianship, and for the digital collections team, it’s one of our favorite areas of work. However, the quotes that open this post (the ones not delivered by Bruce Willis, anyway) are part of a literature that finds significant disparity between the needs of the researchers who form our primary audience and the tools that we – collectively speaking, in the field of digital libraries – have built.

Our team has just begun work on our next-generation platform for digital collections, which we call Tripod3. It will be built on the Fedora/Hydra framework that our Digital Repository Services team is using to develop the Duke Digital Repository. As the project manager, I’m trying to catch up on the recent literature of assessment for digital collections, and consider how we can improve on what we’ve done in the past. It’s one of the main ways  we can engage with researchers, as I wrote about in a previous post.

One of the issues we need to address is the problem of archival context. It’s something that the users of digitized primary sources cite again and again in the studies I’ve read. It manifests itself in a few ways, and could be the subject of a lengthier piece, but I think Chassanoff gives a good sense of it in her study (pp. 470-1):

Overall, findings suggest that historians seem to feel most comfortable using digitized sources when an online environment replicates essential attributes found in archives. Materials should be obtained from a reputable repository, and the online finding aid should provide detailed description. Historians want to be able to access the entire collection online and obtain any needed information about an item’s provenance. Indeed, the possibility that certain materials are omitted from an online collection appears to be more of a concern than it is in person at an archives.

The idea of archival context poses what I think is the central design problem of digital collections. It’s a particular challenge because, while it’s clear that researchers want and require the ability to see an object in its archival context, they also don’t want it. By which I mean, they also want to be able to find everything in the same flat context that everything assumes with a retrieval service like Google.

Archival context implies hierarchy, using the arrangement of the physical materials to order the digital. We were supposed to have broken away from the tyranny of physical arrangement years ago. David Weinberger’s Everything is Miscellaneous trumpeted this change in 2007, and while we had already internalized what he called the “third order of order” by then, it is the unambiguous way of the world now.

With our Tripod2 platform, we first built a shallow “digital collections miscellany” interface at http://library.duke.edu/digitalcollections/, and later started embedding items directly in finding aids. Examples of the latter include the Jazz Loft Project Records and the Alexander Stephens Papers. What we never did was integrate these two modes of publication for digitized primary sources. Items from finding aids do not appear in search results for the main digital collections site, and items on the main site do not generally link back to the finding aid for their parent collection, let alone to the series in which they’re arranged.

While I might give us a passing grade for the subject of “Providing archival context,” it wouldn’t be high enough to get us into, say, Duke. I expect this problem to be at the center of our work on the next-generation platform.


Sources

 

Alexandra Chassanoff, “Historians and the Use of Primary Materials in the Digital Age,” The American Archivist 76, no. 2, 458-480.

Jody L. DeRidder and Kathryn G. Matheny, “What Do Researchers Need? Feedback On Use of Online Primary Source Materials,” D-Lib Magazine 20, no. 7/8, available at http://www.dlib.org/dlib/july14/deridder/07deridder.html

Jennifer Rumer and Roger C. Schonfeld, “Supporting the Changing Research Practices of Historians: Final Report from ITHAKA S+R,” (2012), http://www.sr.ithaka.org/sites/default/files/reports/supporting-the-changing-research-practices-of-historians.pdf.

Tefko Saracevic, “How Were Digital Libraries Evaluated?”, paper first presented at the DELOS WP7 Workshop on the Evaluation of Digital Libraries (2004), available at http://www.scils.rutgers.edu/~tefko/DL_evaluation_LIDA.pdf

Building a Kiosk for the Edge

Many months ago I learned that a new space, The Ruppert Commons for Research, Technology, and Collaboration, was going to be opening at the start of the calendar year. I was tasked with building an informational kiosk that would be seated in the entry area of the space. The schedule was a bit hectic and we ended up pruning some of the desired features, but in the end I think our first iteration has been working well. So, I wanted to share the steps I took to build it.

Setting Requirements

I first met with the Edge team at the end of August 2014. They had an initial ‘wish list’ of features that they wanted to be included in the kiosk. We went through the list and talked about the feasibility of those items, and tried to rank their importance. Our final features list looked something like this:

Primary Features:

  • Events list (both public and private events in the space)
  • Room reservation system
  • Interactive floor plan map
  • Staff lookup
  • Current Time
  • Contact information (chat, email, phone)

Secondary Features:

  • Display of computer availability
  • Ability to report printing / scanning problems
  • Book locations
  • Scheduleable content on ‘home’ screen

Our deadline was the soft opening date of the space at the start of the new year, but with the approaching holidays (and other projects competing for time) this was going to be a pretty fast turnaround. My goal was to have a functional prototype ready for feedback by mid-October. I really didn’t start working on the UI side of things until early that month, so I ended up needing to kick that can down the road a few weeks, but that happens sometimes.

The Hardware

The Library had purchased two Dell 27″ XPS all-in-one touchscreen machines to serve as informational kiosks near the new/temporary main entrance of Perkins/Bostock. For various reasons, that project kept getting postponed. But with the desire to also have a kiosk in the Edge, we decided we could use one of the Dell machines for this purpose. The touch screen display is great: very bright, reasonably accurate color reproduction, and responsive to touch inputs. It does pick up a lot of fingerprints, but that’s sort of unavoidable with a glossy display. The machine seems to run a little bit hot and the fan is far from silent, but in the space you don’t notice it at all. My favorite aspect of this computer is the stand. It’s really fantastic: super easy to adjust, but also very sturdy. You can position it in a variety of ways, depending on the space you’re using it in, and be confident that it won’t slip out of adjustment even under constant use.

Various positions of Dell computer

I think in general we’re a little wary of using consumer-grade hardware in a 24/7 public environment, but for the 1.5 months it’s been deployed it seems to be holding up well enough.

The OS

The Dell XPS came from the factory with Windows 8. I was really curious about using Assigned Access Mode in Windows 8.1, but the need to use a local (non-domain) account necessitated a clean install of 8.1. That sounds annoying, but the process is so fast and effortless, at least compared to the days of Windows yore, that it wasn’t a huge deal. I eventually configured the system as desired: it auto-boots into the local account on startup and then fires up the assigned Windows app (and limits the machine to only that app).

I spent some time playing around with different approaches for a browser to use with assigned access. The goal was to have a browser that ran in a ‘kiosk’ mode in that there was no ability for the user to interact with anything outside of the intended kiosk UI — meaning, no browser chrome windows, bookmarks, etc. I also planned to use Microsoft’s Family Safety controls to limit access to URLs outside of the range of pages that would comprise the kiosk UI. I tried both Google Chrome and Microsoft IE 11 (which really is a good browser, despite pervasive IE hate), but I ended up having trouble with both of them in different ways. Eventually, I stumbled on to a free Windows Store app called KIOSK SP Browser. It does exactly what I want — it’s a simple, stripped down, full screen browser app. It also has some specific kiosk features (like timeout detection) but I’m only using it to load the kiosk homepage on startup.

The Backend

As several of the requirements necessitated data sources that live in the Drupal system that drives our main library site, I figured the path of least resistance would be to also build the kiosk interface in Drupal. Using the Delta module, I set up a version of our theme that stripped out most of the elements that we wouldn’t be using (header, footer, etc.) for the kiosk. I could then apply the delta to a small range of pages using the Context module. The pages themselves are, by and large, quite simple.

Screen shots of the pages in the Edge Kiosk

  • Events – I used a View to import an RSS feed from Yahoo Pipes (which combines events from our own Library system and the larger Duke system).
  • Reserve Spaces – this page loads in content from Springshare’s LibCal system using an iframe.
  • Map – I drew a simplified map in Illustrator based on the architect’s floor plan, then saved it out as an SVG and added ID attributes to the areas I wanted to make interactive.
  • Staff – this page loads in content from a Google spreadsheet using a technique I outlined previously on Bitstreams.
  • Help – this page loads our LibraryH3LP Chat Widget and a Qualtrics email form.
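The Map page above depends on giving each SVG shape an ID and reacting when one is touched. A minimal TypeScript sketch of that lookup might look like the following. The area IDs, the descriptions, and the describeArea helper are all hypothetical, and the real kiosk may handle this quite differently.

```typescript
// Hypothetical ids, matching id attributes on shapes in the exported SVG.
const AREAS: Record<string, string> = {
  "room-a": "Room A (placeholder description)",
  "room-b": "Room B (placeholder description)",
};

// Only the id of the touched element is needed, so any SVG or HTML element fits this shape.
interface Touched {
  id: string;
}

// Return the text to show for a touched shape, or null if the shape is not interactive.
function describeArea(target: Touched): string | null {
  const known = Object.prototype.hasOwnProperty.call(AREAS, target.id);
  return known ? AREAS[target.id] : null;
}

// Browser wiring (assumes the SVG is inlined in the page so its shapes are in the DOM):
//   document.querySelector("svg")?.addEventListener("click", (event) => {
//     const text = describeArea(event.target as Element);
//     if (text) showPopup(text); // showPopup stands in for whatever UI the kiosk uses
//   });
```

Keeping the lookup separate from the click handler means the map can be redrawn in Illustrator without touching the script, as long as the IDs stay the same.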

The Frontend

When it comes time to design an interface, my first step is almost always to sketch on paper. For this project, I did some playing around and ended up settling on a circular motif for the main navigational interface. I based the color scheme and typography on a branding and style guide that was developed for the Edge.

Edge Kiosk home page design

Many years ago I used to turn my sketches into high-fidelity mockups in Photoshop or Illustrator, but for the past couple of years I’ve tended to just dive right in and design on the fly with HTML/CSS. I created a special stylesheet just for this kiosk (it’s based on a fixed pixel layout, as it is only ever intended to be used on that single Dell computer) and also assigned it to load using Delta. One important aspect of a kiosk is providing some hinting to users that they can indeed interact with it. In my experience, this is usually handled in the form of an attract loop.

I created a very simple motion design using my favorite NLE and rendered out an mp4 to use with the kiosk. I then set up the home page to show the video when it first loads and to hide it when the screen is touched. This helps the actual home page content appear to load very quickly (as it’s actually sitting beneath the video). I also included a script on every page to go to the homepage after a preset period of inactivity. It’s currently set to three minutes, but we may tweak that.

Video stills of attract loop

All in all I’m pleased with how things turned out. We’re planning to spend some time evaluating the usage of the kiosk over the next couple of months and then make any necessary tweaks to improve user experience. Swing by the Edge some time and try it out!
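For anyone building something similar, the return-to-home-after-inactivity script mentioned above can be sketched roughly as follows. This is not the kiosk's actual script: the KioskTimers interface and createIdleWatcher function are names invented here, and the home URL and event list in the wiring comment are assumptions.

```typescript
// The two timer calls the watcher needs, injected so the logic is not tied to a browser.
interface KioskTimers {
  set(callback: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

// Returns a function to call on every touch or click. Each call restarts the countdown;
// if a countdown ever finishes without being restarted, onIdle runs.
function createIdleWatcher(timeoutMs: number, onIdle: () => void, timers: KioskTimers): () => void {
  let handle: unknown;
  let pending = false;
  return function recordActivity(): void {
    if (pending) {
      timers.clear(handle);
    }
    handle = timers.set(() => {
      pending = false;
      onIdle();
    }, timeoutMs);
    pending = true;
  };
}

// Browser wiring (the "/" home URL and the event names are assumptions):
//   const browserTimers: KioskTimers = {
//     set: (cb, ms) => window.setTimeout(cb, ms),
//     clear: (h) => window.clearTimeout(h as number),
//   };
//   const touch = createIdleWatcher(3 * 60 * 1000, () => { window.location.href = "/"; }, browserTimers);
//   ["touchstart", "mousedown", "keydown"].forEach((name) => document.addEventListener(name, touch));
//   touch(); // start the first countdown as soon as the page loads
```

Passing the timer functions in, instead of calling setTimeout directly, makes the three-minute behavior easy to check without waiting three minutes.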

Indiana Jones and The Greek Manuscripts

One of my favorite movies as a youngster was Steven Spielberg’s “Raiders of the Lost Ark.” It’s non-stop action as the adventurous Indiana Jones criss-crosses the globe in an exciting yet dangerous race against the Nazis for possession of the Ark of the Covenant. According to the Book of Exodus, the Ark is a golden chest which contains the original stone tablets on which the Ten Commandments are inscribed, the moral foundation for both Judaism and Christianity. The Ark is so powerful that it single-handedly destroys the Nazis and then turns Steven Spielberg and Harrison Ford into billionaires. Countless sequels, TV shows, theme-park rides and merchandise follow.

Greek manuscript 94, binding consists of heavily decorated repoussé silver over leather.

Fast-forward several decades, and I am asked to digitize Duke Libraries’ Kenneth Willis Clark Collection of Greek Manuscripts. Although not quite as old as the Ten Commandments, this is an amazing collection of biblical texts dating all the way back to the 9th century. These are weighty volumes, hand-written using ancient inks, often on animal-skin parchment. The bindings are characterized as Byzantine, and often covered in leathers like goatskin, sometimes with additional metal ornamentation. Although I have not had to run from giant boulders, or navigate a pit of snakes, I do feel a bit like Indiana Jones when holding one of these rare, ancient texts in my hands. I’m sure one of these books must house a secret code that can bestow fame and fortune, in addition to the obvious eternal salvation.

Before digitization, Senior Conservator Erin Hammeke evaluates the condition of each Greek manuscript, and rules out any that are deemed too fragile to digitize. Some are considered sturdy enough, but still need repairs, so Erin makes the necessary fixes. Once a manuscript is given the green light for digitization, I carefully place it in our book cradle so that it cannot be opened beyond a 90-degree angle. This helps protect our fragile bound materials from unnecessary stress on the binding. Next, the aperture, exposure, and focus are carefully adjusted on our Phase One P65+ digital camera so that the numerical values of our X-Rite color calibration target, placed on top of the manuscript, match the numerical readings shown on our calibrated monitors.

Greek manuscript 101, with X-Rite color calibration target, secured in book cradle.

As the photography begins, each page of the manuscript is carefully turned by hand, so that a new image can be made of the following page. This is a tedious process, but requires careful concentration so the pages are consistently captured throughout each volume. Right-hand (recto) pages are captured first, in succession. Then the volume is turned over, so that the left-hand (verso) pages can be captured. I can’t read Greek, but it’s fascinating to see the beauty of the calligraphy, and view the occasional illustrations that appear on some pages. Sometimes, I discover that moths, beetles or termites have bored through the pages over time. It’s interesting to speculate as to which century this invasive destruction may have occurred. Perhaps the Nazis from the Indiana Jones movies traveled back in time, and placed the insects there?

Greek manuscript 101, showing insect damage.

Once the photography is complete, the recto and verso images are processed and then interleaved to recreate the left-right page order of the original manuscript. Next, the images go through a quality-control process in which any extraneous background area is cropped out, and each page is checked for clarity and consistent color and illumination. After that, another round of quality control ensures that no pages are missing or out of order. Finally, the images are converted to Pyramid TIFF files, which allow our web site users to zoom out and see all the pages at once, or zoom in to see maximum detail of any selected page. Thirty-eight Greek manuscripts are ready for online viewing now, and many more are coming soon. Stay tuned for the exciting sequel: “Indiana Jones and Even More Greek Manuscripts.”
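The interleaving step described above can be illustrated with a short TypeScript sketch. It is not the processing script we actually use: the interleavePages function and the file names are invented for illustration, and it assumes both capture passes are already sorted front to back with one image per leaf.

```typescript
// Merge two capture passes into reading order: leaf 1 recto, leaf 1 verso, leaf 2 recto, ...
// Assumes both lists are sorted front to back and hold one image per leaf.
function interleavePages<T>(rectos: T[], versos: T[]): T[] {
  if (rectos.length !== versos.length) {
    // A mismatch likely means a page was skipped or shot twice, so stop rather than guess.
    throw new Error(`recto/verso count mismatch: ${rectos.length} vs ${versos.length}`);
  }
  const pages: T[] = [];
  for (let i = 0; i < rectos.length; i++) {
    pages.push(rectos[i], versos[i]);
  }
  return pages;
}

// File names are invented for illustration.
const ordered = interleavePages(
  ["ms101_001r.tif", "ms101_002r.tif"],
  ["ms101_001v.tif", "ms101_002v.tif"],
);
// ordered: ["ms101_001r.tif", "ms101_001v.tif", "ms101_002r.tif", "ms101_002v.tif"]
```

If the verso pass ends up captured back to front after the volume is turned over, that list would need to be reversed before merging.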

Embeds, Math & Beyond

This week, in conjunction with our H. Lee Waters Film Collection unveiling, we rolled out a handy new Embed feature for digital collections items.  The idea is to make it as easy as possible for someone to share their discoveries from our collections, with proper attribution, on other websites or blogs.

How To

It’s simple, really, and mimics the experience you’re likely to encounter getting embed code from other popular sites with videos, images, and the like. We modeled our approach loosely on the Internet Archive’s video embed service (e.g., visit this video and click the Share icon, but only if you are unafraid of clowns).

Embed Link

Click the “Embed” link under an item from Duke Digital Collections, and copy the snippet of code that pops up. Paste it in your website, and you’re done!

Examples

I’ll paste a few examples below using different kinds of items. The embed code is short and nearly identical for all of these:

A Single Image

Paginated Item

A Video

Single-Track Audio

Multi-Track Audio

Document with Document Viewer

Technical Considerations

Building this feature required a little bit of math, some trial & error, and a few tricks. The steps were to:

  • Set up a service to return customized item pages at the path http://library.duke.edu/digitalcollections/embed/<itemid>/
  • Use CSS & JS to make the media as fluid as possible to fill whatever space it ends up in
  • Use a fixed height and overflow: auto on the attribution box so longer content will scroll
  • Use <link rel="canonical"> to ensure the item’s embed page is associated with the real item page (especially to improve links / ranking signals for search engines)
  • Present the user a copyable HTML <iframe> element in the regular item page that has the correct height & width attributes to accommodate the item(s) to be embedded

This last point is where the math comes in. Take a single image item, for example. With a landscape-orientation image we need to give the user a different <iframe> height to copy than we would for a portrait. It gets even more complicated when we have to account for multiple tracks of audio or video, or combinations of the two.
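To make the height calculation concrete, here is a rough TypeScript sketch. It is not our production code: the EmbedItem shape, the embedHeight and embedSnippet functions, the 500-pixel default width, and both pixel constants are assumptions chosen for illustration. Only the embed path comes from the list above.

```typescript
// All pixel constants are illustrative assumptions, not the values the real site uses.
const ATTRIBUTION_HEIGHT = 80; // fixed-height attribution box (scrolls on overflow)
const AUDIO_TRACK_HEIGHT = 40; // one row per audio track

interface EmbedItem {
  id: string;
  // Native pixel dimensions of the image or video frame, if the item has one.
  mediaWidth?: number;
  mediaHeight?: number;
  // Number of audio tracks listed under the player.
  audioTracks?: number;
}

// Height the iframe needs at a given width: scaled media + track rows + attribution box.
function embedHeight(item: EmbedItem, frameWidth: number): number {
  let height = ATTRIBUTION_HEIGHT;
  if (item.mediaWidth && item.mediaHeight) {
    // Preserve the aspect ratio: a portrait image needs a taller frame than a landscape one.
    height += Math.round(frameWidth * (item.mediaHeight / item.mediaWidth));
  }
  height += (item.audioTracks ?? 0) * AUDIO_TRACK_HEIGHT;
  return height;
}

// Build the copyable snippet, pointing at the embed path described above.
function embedSnippet(item: EmbedItem, frameWidth: number = 500): string {
  const src = `http://library.duke.edu/digitalcollections/embed/${item.id}/`;
  const height = embedHeight(item, frameWidth);
  return `<iframe src="${src}" width="${frameWidth}" height="${height}"></iframe>`;
}
```

With these made-up constants, a 1000×500 landscape image in a 500-pixel-wide frame gets a height of 330, while a 500×1000 portrait gets 1080, which is why the snippet has to be computed per item.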

Coming Soon

We’ll refine this feature a bit in the coming weeks, and work out any embed-bugs we discover. We’ll also be developing a similar feature for embedding digitized content found in our archival collection guides.

Having it “All” – About Library Search Results

This fall we changed the default tool that students and faculty use to research library holdings. We have tools that work well for a broad search and tools that are tailored for more specialized research. So, how is this change working out?

Word cloud depicting the 30 most frequently used search terms. The size of the text is proportional to the number of times the term has been used.

We’ve got numbers and we’ve got opinion. First, let’s look at the numbers.

  • The most used feature on the Duke Libraries website is the search box on the homepage with 211,655 searches performed using the default “All” tab between August 25 and November 16, 2014.
  • Within the “All” tab search results, patrons selected results from Articles 48% of the time, results from Books & Media 44% of the time and other results 8% of the time. These results were presented side-by-side on a single results page.
  • The All search isn’t the only option on our homepage as the Books & Media tab was used 68,566 times and the Articles tab was used 46,028 times during the same timeframe.
  • The five most used search terms were PubMed, Web of Science, JSTOR, RefWorks, and Dictionary of National Biography.
  • The most frequently searched for fictional character was Tom Sawyer.
  • The most searched for person was Dr. Martin Luther King, Jr.
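To put the three tab counts above in proportion, here is a small TypeScript calculation. It only covers the three tabs quoted in this post, so the shares are relative to those three and not to every search option on the site. The tabShares helper is a name made up for this example.

```typescript
// Search counts reported above for August 25 to November 16, 2014.
const searchesByTab: Record<string, number> = {
  "All": 211655,
  "Books & Media": 68566,
  "Articles": 46028,
};

// Percentage share of each tab among the tabs given, rounded to one decimal place.
function tabShares(counts: Record<string, number>): Record<string, number> {
  const tabs = Object.keys(counts);
  const total = tabs.reduce((sum, tab) => sum + counts[tab], 0);
  const shares: Record<string, number> = {};
  for (const tab of tabs) {
    shares[tab] = Math.round((counts[tab] / total) * 1000) / 10;
  }
  return shares;
}

const shares = tabShares(searchesByTab);
// shares: All 64.9, Books & Media 21, Articles 14.1 (percent of 326,249 searches)
```

In other words, roughly two out of three homepage searches across these tabs went through the default “All” tab.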

So what thoughts have you shared with us about the search options we provide?

On the Libraries' homepage, you can click the gear icon to choose a different search tab as your customized default.
  • During the first four weeks of the semester, 48 people submitted their opinions through a survey linked from the search results page.
  • Thirty percent of survey respondents said that they liked having the Articles and Books & Media results appear side by side on the new search results page.
  • Twenty-seven percent said they thought the page looked cluttered or that it was hard to read.
  • Forty percent said that they did not know you can change the default search tab that appears when you view the Duke Libraries’ homepage.
  • Twenty-five percent said that they did not know that they can choose a more highly focused search option from the Search & Find menu.
  • Testing with a small group of researchers revealed that it was difficult to locate material from the Rubenstein Library using our default search results screen.

Based on your feedback, we made the following improvements to the search results page during the semester.

  • We de-cluttered the information shown in the Articles and Books & Media columns to make results easier to read.
  • We moved “Our Website” results to the top of the right column.
  • We reduced the space used by the search box on the results page.

In the coming months, we will explore ways of making it easier to find materials from the Rubenstein Library and from University Archives. We are also investigating options for implementing a Best Bets feature on the results page; this would provide clearer access to some of the most used resources.

What can you do to help?

Complete our online survey and tell us what you think about the search tools provided through the Libraries’ homepage.

Anatomy of an Exhibit Kiosk

I’ve had the pleasure of working on several exhibit kiosks during my time at the library. Most of them have been simple in their functionality, but we’re hoping to push some boundaries and get more creative in the future. Most recently, I’ve been working on building a kiosk for the Queering Duke History: Understanding the LGBTQ Experience at Duke and Beyond exhibit. It highlights oral history interviews with six former Duke students. This particular kiosk example isn’t very complicated, but I thought it would be fun to outline how it’s put together.

Screen shot of the 'attract' loop

Hardware

Most of our exhibits run on one of two late 2009 27″ iMacs that we have at our disposal. The displays are high-res (1920×1080) and vivid, the built-in speakers sound fine, and the processors are strong enough to display multimedia content without any trouble. Sometimes we use the kiosk machines to loop video content, so there’s no user interaction required. With this latest iteration, as users will be able to select audio files for playback, we’ll need to provide a mouse. We do our best to secure input devices to our kiosk stand, and in my tenure we’ve not had any problems. But I understand that in the past input devices have sometimes been damaged or gone missing. As we migrate to touch-screen machines in the future, these sorts of issues won’t be a problem.

Software

We tend to leave our kiosk machines out in the open in public spaces. If a machine isn’t sufficiently locked down, it can end up being used for purposes other than what we have in mind. Our approach is to set up a user account that has very narrow privileges and set it as the default login (so when the machine starts up it boots into our ‘kiosk’ account). In OS X you can set up user permissions, startup programs, and other settings via ‘Users and Groups’ in the System Preferences. We also set up power saving settings so that the computer will sleep between midnight and 6:00am using the schedule option in Energy Saver.

My general approach for interactive content is to build web pages, host them externally, and load them onto the kiosk in a web browser. I think the biggest benefits of this approach are that we can make updates without having to take down the kiosk and also track user interactions using Google Analytics. However, there are drawbacks as well. We need to ensure that we have reliable network connectivity, which can be a challenge sometimes. By placing the machine online, we also add to the risk that it can be used for purposes other than what we intend. So in order to lock things down even more, we utilize xStand to display our interactive content. It allows for full screen browsing without any GUI chrome, black-listing and/or white-listing sites, and most importantly, it restarts automatically after a crash. In my experience it’s worked very well.

User Interface

This particular exhibit kiosk has only one real mission – to enable users to listen to a series of audio clips. As such, the UI is very simple. The first component is a looping ‘attract’ screen. The attract screen serves the dual purpose of drawing attention to the kiosk and keeping pixels from getting burned in on the display. For this kiosk I’m looping a short mp4 video file. The video container is wrapped in a link, and when it’s clicked, a bit of JavaScript hides the video and displays the content div.
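The hide-the-video, show-the-content switch can be sketched in TypeScript as a simple two-state toggle. This is an illustration, not the script running on the kiosk: the KioskView type, the Hideable interface, the showView function, and the element IDs in the wiring comment are all placeholders.

```typescript
type KioskView = "attract" | "content";

// Only the hidden flag is touched, so any HTMLElement satisfies this shape.
interface Hideable {
  hidden: boolean;
}

// Show exactly one of the two layers and hide the other.
function showView(view: KioskView, video: Hideable, content: Hideable): void {
  video.hidden = view !== "attract";
  content.hidden = view !== "content";
}

// Browser wiring (element ids are placeholders):
//   const video = document.getElementById("attract")!;
//   const content = document.getElementById("content")!;
//   showView("attract", video, content);
//   video.addEventListener("click", () => showView("content", video, content));
```

Driving both layers from one function keeps them from ever being visible, or hidden, at the same time.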

 

The content area of the page is very simple – there is a group of images that can be clicked on. When one is clicked, a lightbox window (I like Fancy Box) pops up that holds the relevant audio clips. I’m using simple HTML5 audio playback controls to stream the mp3 files.

Screen shot of the 'home' screen UI
Screen shot of the audio playback UI

Finally, there’s another script running in the background that detects any user input. After 10 minutes of inactivity, the page reloads, which brings back the attract screen.

The Exhibit

Queering Duke History runs through December 14, 2014 in the Perkins Library Gallery on West Campus. Stop by and check it out!

Analog to Digital to Analog: Impact of digital collections on permission-to-publish requests

We’ve written many posts on this blog that describe (in detail) how we build our digital collections at Duke, how we describe them, and how we make them accessible to researchers.

At a Rubenstein Library staff meeting this morning one of my colleagues–Sarah Carrier–gave an interesting report on how some of our researchers are actually using our digital collections. Sarah’s report focused specifically on permission-to-publish requests, that is, cases where researchers requested permission from the library to publish reproductions of materials in our collection in scholarly monographs, journal articles, exhibits, websites, documentaries, and any number of other creative works. To be clear, Sarah examined all of these requests, not just those involving digital collections. Below is a chart showing the distribution of the types of publication uses.

Types of permission-to-publish requests, FY2013-2014

What I found especially interesting about Sarah’s report, though, is that nearly 76% of permission-to-publish requests did involve materials from the Rubenstein that have been digitized and are available in Duke Digital Collections. The chart below shows the Rubenstein collections that generate the highest percentage of requests. Notice that three of these in Duke Digital Collections were responsible for 40% of all permission-to-publish requests:

Collections generating the most permission-to-publish requests, FY2013-2014

So, even though we’ve only digitized a small fraction of the Rubenstein’s holdings (probably less than 1%), it is this 1% that generates the overwhelming majority of permission-to-publish requests.

I find this stat both encouraging and discouraging at the same time. On one hand, it’s great to see that folks are finding our digital collections and using them in their publications or other creative output. On the other hand, it’s frightening to think that the remainder of our amazing but yet-to-be-digitized collections are rarely if ever used in publications, exhibits, and websites.

I’m not suggesting that researchers aren’t using un-digitized materials. They certainly are, in record numbers. More patrons are visiting our reading room than ever before. So how do we explain these numbers? Perhaps research and publication are really two separate processes. Imagine you’ve just written a 400-page monograph on the evolution of popular song in America. You probably just want to sit down at your computer, fire up your web browser, and do a Google Image Search for “historic sheet music” to find some cool images to illustrate your book. Maybe I’m wrong, but if I’m not, we’ve got you covered. After it’s published, send us a hard copy. We’ll add it to the collection and maybe we’ll even digitize it someday.

[Data analysis and charts provided by Sarah Carrier – thanks Sarah!]

Large-Scale Digitization and Lessons from the CCC Project

Back in February 2014, we wrapped up the CCC project, a collaborative three-year IMLS-funded digitization initiative with our partners in the Triangle Research Libraries Network (TRLN). The full title of the project is a mouthful, but it captures its essence: “Content, Context, and Capacity: A Collaborative Large-Scale Digitization Project on the Long Civil Rights Movement in North Carolina.”

Together, the four university libraries (Duke, NC State, UNC-Chapel Hill, NC Central) digitized over 360,000 documents from thirty-eight collections of manuscripts relevant to the project theme. About 66,000 were from our David M. Rubenstein Rare Book & Manuscript Library collections.

Large-Scale

So how large is “large-scale”? By comparison, when the project kicked off in summer 2011, we had a grand total of 57,000 digitized objects available online (“published”), collectively accumulated through sixteen years of digitization projects. That number was 69,000 by the time we began publishing CCC manuscripts in June 2012. Putting just as many documents online in three years as we’d been able to do in the previous sixteen naturally requires a much different approach to creating digital collections.

Traditional Digitization | Large-Scale Digitization
Individual items identified during scanning | No item-level identification: entire folders scanned
Descriptive metadata applied to each item | Archival description only (e.g., at the folder level)
Robust portals for search & browse | Finding aid / collection guide as access point

There are considerable tradeoffs between document availability and discovery and access features, but working at this scale speeds publication considerably. Large-scale digitization was new for all four partners, so we benefited by working together.
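To put the scale described above in rough numbers, here is a back-of-the-envelope sketch using only the figures quoted in this post. It assumes, purely for illustration, that Duke's roughly 66,000 CCC documents went online evenly over the three project years and that the earlier 57,000 objects accumulated evenly over sixteen years; neither assumption comes from the project reports.

```python
# Figures quoted in the post (the CCC count is approximate).
objects_before = 57_000   # digitized objects online when the project began
years_before = 16         # years of digitization projects behind that total
ccc_documents = 66_000    # Duke documents digitized for CCC
ccc_years = 3             # length of the CCC project

# Assumption: output was spread evenly across each period.
rate_before = objects_before / years_before   # objects per year, traditional
rate_ccc = ccc_documents / ccc_years          # documents per year, large-scale
speedup = rate_ccc / rate_before

print(f"Traditional: about {rate_before:,.0f} objects per year")
print(f"Large-scale: about {rate_ccc:,.0f} documents per year")
print(f"Roughly {speedup:.1f} times the yearly output")
```

Under those assumptions the large-scale approach works out to roughly six times the yearly output, which is the gap that the tradeoffs in the comparison above are buying.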

Digitized documents accessed through an archival finding aid / collection guide with folder-level description.

Project Evaluation

CCC staff completed qualitative and quantitative evaluations of this large-scale digitization approach during the course of the project, ranging from conducting user focus groups and surveys to analyzing the impact on materials prep time and image quality control. Researcher assessments targeted three distinct user groups: 1) Faculty & History Scholars; 2) Undergraduate Students (in research courses at UNC & NC State); 3) NC Secondary Educators.

Here are some of the more interesting findings (consult the full reports for details):

  • Ease of Use. Faculty and scholars, for the most part, found it easy to use digitized content presented this way. Undergraduates were more ambivalent, and secondary educators had the most difficulty.
  • To Embed or Not to Embed. In 2012, Duke was the only library presenting image thumbnails embedded directly within finding aids, along with a lightbox-style image navigator. Undergrads who used Duke’s interface found it easier to use than UNC’s or NC Central’s, and Duke’s collections had a higher rate of images viewed per folder than the other partners’. UNC & NC Central’s interfaces now use a similar convention.
  • Potential for Use. Most users surveyed said they could indeed imagine themselves using digitized collections presented in this way in the course of their research. However, the approach falls short in meeting key needs for secondary educators’ use of primary sources in their classes.
  • Desired Enhancements. The top two most desired features by faculty/scholars and undergrads alike were 1) the ability to search the text of the documents (OCR), and 2) the ability to explore by topic, date, document type (i.e., things enabled by item-level metadata). PDF download was also a popular pick.

 

Impact on Duke Digitization Projects

Since the moment we began putting our CCC manuscripts online (June 2012), we’ve completed the eight CCC collections using this large-scale strategy, and an additional eight manuscript collections outside of CCC using the same approach. We have now cumulatively put more digital objects online using the large-scale method (96,000) than we have via traditional means (75,000). But in that time, we have also completed eleven digitization projects with traditional item-level identification and description.

We see the large-scale model for digitization as complementary to our existing practices: a technique we can use to meet the publication needs of some projects.

Usage

Do people actually use the collections when presented in this way? Some interesting figures:

  • Views / item in 2013-14 (traditional digital object; item-level description): 13.2
  • Views / item in 2013-14 (digitized image within finding aid; folder-level description): 1.0
  • Views / folder in 2013-14 (digitized folder view in finding aid): 8.5

It’s hard to attribute the usage disparity entirely to the publication method (they’re different collections, for one). But it’s reasonable to deduce (and unsurprising) that bypassing item-level description generally results in less traffic per item.

On the other hand, one of our CCC collections (The Allen Building Takeover Collection) has indeed seen heavy use: so much, in fact, that nearly 90% of TRLN’s CCC items viewed in the final six months of the project were from Duke. Its images averaged over 78 views apiece in the past year; its eighteen folders were opened 363 times apiece. Why? The publication of this collection coincided with an on-campus exhibit. It was also incorporated into multiple courses at Duke for writing assignments that used primary sources.

The takeaway is, sometimes having interesting, important, and timely content available for use online is more important than the features enabled or the process by which it all gets there.

Looking Ahead

We’ll keep pushing ahead with evolving our practices for putting digitized materials online. We’ve introduced many recent enhancements, like fulltext searching, a document viewer, and embedded HTML5 video. Inspired by the CCC project, we’ll continue to enhance our finding aids to provide access to digitized objects inline for context (e.g., The Jazz Loft Project Records). Our TRLN partners have also made excellent upgrades to the interfaces to their CCC collections (e.g., at UNC, at NC State) and we plan, as usual, to learn from them as we go.

Bento is Coming!

A unified search results page, commonly referred to as the “Bento Box” approach, has been an increasingly popular method to display search results on library websites. This method helps users gain quick access to a limited result set across a variety of information scopes while providing links to the various silos for the full results. NCSU’s QuickSearch implementation has been in place since 2005 and has been extremely influential on the approach taken by other institutions.
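The pattern described above can be sketched in a few lines. This is a minimal illustration, not any library's actual implementation: the silo names, the in-memory stub searchers, and the example.org URLs are all hypothetical stand-ins for real search backends.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
from urllib.parse import urlencode


@dataclass
class BentoBox:
    """One compartment of the unified results page."""
    label: str
    top_results: List[str]   # the limited result set shown in this box
    total: int               # how many hits the silo found overall
    full_results_url: str    # link out to the silo's own full results


# A searcher takes a query and returns all matching titles from one silo.
Searcher = Callable[[str], List[str]]


def make_stub_silo(titles: List[str]) -> Searcher:
    """Hypothetical in-memory silo: case-insensitive substring match."""
    def search(query: str) -> List[str]:
        q = query.lower()
        return [t for t in titles if q in t.lower()]
    return search


def bento_search(query: str, silos: Dict[str, Searcher],
                 full_urls: Dict[str, str], per_box: int = 3) -> List[BentoBox]:
    """Query every silo separately; keep only the top few hits per box."""
    boxes = []
    for label, search in silos.items():
        hits = search(query)
        url = full_urls[label] + "?" + urlencode({"q": query})
        boxes.append(BentoBox(label, hits[:per_box], len(hits), url))
    return boxes


# Hypothetical demo data, for illustration only.
SILOS = {
    "Articles": make_stub_silo([
        "Jazz in the archives", "Digitizing jazz photographs",
        "Jazz loft oral histories", "Jazz and civil rights",
    ]),
    "Books": make_stub_silo(["A History of Jazz", "Sheet Music in America"]),
}
FULL_URLS = {
    "Articles": "https://example.org/articles",
    "Books": "https://example.org/books",
}

boxes = bento_search("jazz", SILOS, FULL_URLS, per_box=3)
for box in boxes:
    print(f"{box.label}: showing {len(box.top_results)} of {box.total}")
    print(f"  see all results: {box.full_results_url}")
```

Because each silo is queried and displayed on its own, results are never merged into one ranked stream; each box only has to rank within its own kind of material.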

Way back in December of 2012, the DUL began investigating and planning for implementing a Bento search results layout on our website. Extensive testing revealed that users favor searching from a single box, as is their typical experience conducting web searches via Google and the like. Like many libraries, we’ve been using Summon as a unified discovery layer for articles, books, and other resources for a few years, providing an ‘All’ tab on our homepage as the entry point. Summon aggregates these various sources into a common index, presented in a single stream on search results pages. Our users often find this presentation overwhelming or confusing and prefer other search tools. As such, we’ve demoted our ‘All’ search on our homepage, although users can set it as the default thanks to the very slick Default Scope search tool built by Sean Aery (with inspiration from the University of Notre Dame’s Hesburgh Libraries website):

Default Search Tool

The library’s Web Experience Team (WebX) proposed the Bento project in September of 2013. Some justifications for the proposal were as follows:

Bento boxing helps solve these problems:

  • We won’t have to choose which silo should be our default search scope (in our homepage or masthead)
  • Synthesizing relevance ranking across very different resources is extremely challenging, e.g., articles get in the way of books if you’re just looking for books (and vice-versa).
  • We need to move from “full collection discovery to full library discovery” – in the same search, users discover expertise, guides/experts, other library provisions alongside items from the collections. 1
  • “A single search box communicates confidence to users that our search tools can meet their information needs from a single point of entry.” 2

Citations:

  1. Thirteen Ways of Looking at Libraries, Discovery, and the Catalog by Lorcan Dempsey.
  2. How Users Search the Library from a Single Search Box by Cory Lown, Tito Sierra, and Josh Boyer

Sean also developed this mockup of what Bento results could look like on our website and we’ve been using it as the model for our project going forward:

Bento Mockup

For the past month our Bento project team has been actively developing our own implementation. We have had the great luxury of building upon work that was already done by brilliant developers at our sister institutions (NCSU and UNC). Particular thanks go out to Tim Shearer at UNC Libraries, who provided us with the code that they are using on their Bento results page, which in turn was heavily influenced by the work done at NCSU Libraries.

Our approach includes using results from Summon, Endeca, Springshare, and Google. We’re building this as a Drupal module, which will make it easy to integrate into our site. We’re also hosting the code on GitHub so others can gain from what we’ve learned, and to make our future enhancements to the module easier to implement.

Our plan is to roll out Bento search in August, so stay tuned for the official launch announcement!

 


PS — as the 4th of July holiday is right around the corner, here are some interesting items from our digital collections related to independence day: