Category Archives: Technology

On Tour with H. Lee Waters: Visualizing a Logbook with TimeMapper

The H. Lee Waters Film Collection we published earlier this month has generated quite a buzz. In the last few weeks, we’ve seen a tremendous uptick in visits to Duke Digital Collections and received comments, mail, and phone calls from Waters fans, film buffs, and residents of the small towns he visited and filmed over 70 years ago. It’s clear that Waters’ “Movies of Local People” have wide appeal.

The 92 films in the collection are clearly the highlight, but as an archivist and metadata librarian I’m just as fascinated by the logbooks Waters kept as he toured across the Carolinas, Virginia, and Tennessee screening his films in small town theaters between 1936 and 1942. In the logbooks, Waters typically recorded the theater name and location where he screened each film, what movie-goers were charged, his percentage of the profits, his revenue from advertising, and sometimes the amount and type of footage shown.

As images in the digital collection, the logbooks aren’t that interesting (at least visually), but the data they contain tell a compelling story. To bring the logbooks to life, I decided to give structure to some of the data (yes, a spreadsheet) and used a new visualization tool I recently discovered called TimeMapper to plot Waters’ itinerary on a synchronized timeline and map: call it a timemap! You can interact with the embedded timemap below, or see a full-screen version here. Currently, the Waters timemap only includes data from the first 15 pages of the logbook (more to come!). Already, though, we can start to visualize Waters’ route and the frequency of film screenings. We can also interact with the digital collection in new ways:

  • Click on a town in the map view to see when Waters visited, then view the logbook entry or any available films for that town.
  • Slide the timeline and click through the entries to trace Waters’ route.
  • Toggle forward or backward through the logbook entries to travel along with Waters.

For me, the Waters timemap demonstrates the potential for making use of the data in our collections, not just the digitized images or artifacts. With so many simple and freely available tools like TimeMapper and Google Fusion Tables (see my previous post), it has never been so easy to create interactive visualizations quickly and with limited technical skills.

I’d love to see someone explore the financial data in Waters’ logbooks to see what we might learn about his accounting practices or even about the economic conditions in each town. The logbook data has the potential to support any number of research questions. So start your own spreadsheet and have at it!

[Thanks to the folks at Open Knowledge Labs for developing TimeMapper]

When it Rains, It Pours: A Digital Collections News Round Up

2015 has been a banner year for Duke Digital Collections, and it’s only January! We have already published a new collection, broken records, and expanded our audience. Truth be told, we have been on quite a roll for the last several months, and with the holidays we haven’t had a chance to share every new digital collection with you. Today on Bitstreams, we highlight digital collection news that didn’t quite make the headlines in the past few months.

H. Lee Watersmania

Compare normal Digital Collections traffic to our Waters spike on Monday January 19th.

Before touching on news you haven’t heard about, we must continue the H. Lee Waters PR blitz. Last week, we launched the H. Lee Waters digital collection. We and the Rubenstein Library knew there was a fair amount of pent-up demand for this collection; however, we have been amazed by the reaction of the public. Within a few days of launch, site visits hit what we believe (though cannot say with 100% certainty) to be an all-time high of 17,000 visits and 37,000 pageviews on Jan 19. We even suspect that the intensity of the traffic has contributed to some recent server performance issues (apologies if you have had trouble viewing the films – we and campus IT are working on it).

We have also seen more than 20 new user comments left on Waters’ film pages, 6 comments left on the launch blog post, and 40+ new likes on the Duke Digital Collections Facebook page since last week. The Rubenstein Library has also received a surge of inquiries about the collection. These may not be “official” stats, but we have never seen this much direct public reaction to one of our new digital collections, and we could not be more excited about it.

Early Greek Manuscripts

An example from the early Greek Manuscript collection.

In November we quietly made 38 early Greek manuscripts available online, one of which is the digital copy of a manuscript since returned to the Greek government. These beautiful volumes are part of the Rubenstein Library and date from the 9th to the 17th centuries. We are still digitizing volumes from this collection and hope to publish more in the late spring. At that time we will make some changes to the look and feel of the digital collection. Our goal will be to further expose the general public to the beauty of these volumes while also increasing discoverability for multiple scholarly communities.

 

Link Media Wall Exhibit

In early January, the Libraries’ Digital Exhibits Working Group premiered their West Campus Construction Link media wall exhibit, affectionately nicknamed the Game of Stones. The exhibit features content from the Construction of Duke University digital collection and the Duke University Archives’ Flickr sets. The creation of this exhibit has been described previously on Bitstreams (here and here). Head on down to the Link and see it for yourself!

 

History of Medicine Artifacts

Medicine bottles and glasses from the HOM artifacts collection.

Curious about bone saws, bloodletting or other historic medical instruments? Look no further than the Rubenstein Library’s History of Medicine Artifacts Collection Guide. In December we published over 300 images of historic medical artifacts embedded in the collection guide. It’s an incredible and sometimes frightening treasure trove of images.

These are legacy images taken by the History of Medicine. While we didn’t shoot these items in the Digital Production Center, the digital collections team still took a hands-on approach to normalizing the filenames and overall structure of the image set so we could publish them. This project was part of our larger efforts to make more media types embeddable in Rubenstein collection guides, a deceptively difficult process that will likely be covered more in depth in a future Bitstreams post.

Digitization to Support the Student Nonviolent Coordinating Committee (SNCC) Legacy Project Partnership

Transcript from an oral history in the Joseph Sinsheimer papers.

In the last year, Duke University Libraries has been partnering with the SNCC Legacy Project and the Center for Documentary Studies on One Person One Vote: The Legacy of SNCC and the Fight for Voting Rights. As part of the project, the digital collections team has digitized several collections related to SNCC and made content available from each collection’s collection guide. The collections include audio recordings, moving images and still images. Selections from the digitized content will soon be made available on the One Person One Vote site to be launched in March 2015. In the meantime, you can visit the collections directly: Joseph Sinsheimer Papers, Faith Holsaert Papers, and SNCC 40th Anniversary Conference.

 

Coach 1K

Coach K’s first Duke win against Stetson.

This one is hot off the digital presses. Just this week, in anticipation of victory #1,000, Digital Collections partnered with University Archives to publish Coach K’s very first win at Duke.

What’s Next for Duke Digital Collections?

The short answer is, a lot!  We have very ambitious plans for 2015.  We will be developing the next version of our digital collections platform, hiring an intern (thank you University Archives), restarting digitization of the Gedney collection, and of course publishing more of your favorite digital collections.   Stay tuned!

Embeds, Math & Beyond

This week, in conjunction with our H. Lee Waters Film Collection unveiling, we rolled out a handy new Embed feature for digital collections items.  The idea is to make it as easy as possible for someone to share their discoveries from our collections, with proper attribution, on other websites or blogs.

How To

It’s simple, really, and mimics the experience you’re likely to encounter getting embed code from other popular sites with videos, images, and the like. We modeled our approach loosely on the Internet Archive‘s video embed service (e.g., visit this video and click the Share icon, but only if you are unafraid of clowns).

Embed Link

Click the “Embed” link under an item from Duke Digital Collections, and copy the snippet of code that pops up. Paste it in your website, and you’re done!

Examples

I’ll paste a few examples below using different kinds of items. The embed code is short and nearly identical for all of these:

A Single Image

Paginated Item

A Video

Single-Track Audio

Multi-Track Audio

Document with Document Viewer

Technical Considerations

Building this feature required a little bit of math, some trial & error, and a few tricks. The steps were to:

  • Set up a service to return customized item pages at the path http://library.duke.edu/digitalcollections/embed/<itemid>/
  • Use CSS & JS to make the media as fluid as possible to fill whatever space it ends up in
  • Use a fixed height and overflow: auto on the attribution box so longer content will scroll
  • Use link rel="canonical" to ensure the item’s embed page is associated with the real item page (especially to improve links / ranking signals for search engines).
  • Present the user a copyable HTML <iframe> element in the regular item page that has the correct height & width attributes to accommodate the item(s) to be embedded

This last point is where the math comes in. Take a single image item, for example. With a landscape-orientation image we need to give the user a different <iframe> height to copy than we would for a portrait. It gets even more complicated when we have to account for multiple tracks of audio or video, or combinations of the two.
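
To make the arithmetic concrete, here is a simplified sketch of that calculation for a single-image item. The function name and pixel values are illustrative only, not the production code; the fixed-height attribution box is the one described in the list above.

    // Sketch: compute the <iframe> height to offer for a single-image item.
    // EMBED_WIDTH and ATTRIBUTION_HEIGHT are illustrative values, not the real ones.
    var EMBED_WIDTH = 600;        // width of the copyable <iframe>
    var ATTRIBUTION_HEIGHT = 50;  // fixed, scrollable attribution box

    function iframeHeight(imageWidth, imageHeight) {
      // Scale the image to the embed width, then leave room for the attribution box.
      var scaledImageHeight = Math.round(EMBED_WIDTH * (imageHeight / imageWidth));
      return scaledImageHeight + ATTRIBUTION_HEIGHT;
    }

    // A portrait image needs a taller iframe than a landscape one:
    // iframeHeight(2000, 3000) => 950
    // iframeHeight(3000, 2000) => 450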

Coming Soon

We’ll refine this feature a bit in the coming weeks, and work out any embed-bugs we discover. We’ll also be developing a similar feature for embedding digitized content found in our archival collection guides.

Winter Cross-Training in the DPC

The Digital Production Center engages with various departments within the Libraries and across campus to preserve endangered media and create unique digital collections. We work especially closely with The Rubenstein Rare Book, Manuscript, & Special Collections Library, as they hold many of the materials that we digitize and archive on a daily basis. This collaboration requires a shared understanding of numerous media types and their special characteristics; awareness of potential conservation and preservation issues; and a working knowledge of digitization processes, logistics, and limitations.

In order to facilitate this ongoing collaboration, we recently did a semester-long cross-training course with The Rubenstein’s Reproductions Manager, Megan O’Connell. Megan is one of our main points of contact for weekly patron requests, and we felt that this training would strengthen our ability to navigate tricky and time-sensitive digitization jobs heading into the future. The plan was for Megan to work with all three of our digitization specialists (audio, video, & still image) to get a combination of hands-on and observational learning opportunities.


Still image comprises the bulk of our workload, so we decided to spend most of the training on these materials. “Still image” includes anything that we digitize via photographic or scanning technology, e.g. manuscripts, maps, bound periodicals, posters, photographs, slides, etc. We identified a group of uniquely challenging materials of this type and digitized one of each for hands-on training, including:

  • Bound manuscript – Most of these items cannot be opened more than 90 degrees. We stabilize them in a custom-built book cradle, capture the recto sides of the pages, then flip the book and capture the verso sides. The resulting files then have to be interleaved into the correct sequence.
  • Map, or other oversize item – These types of materials are often too large to capture in one single camera shot. Our setup allows us to take multiple shots (with the help of the camera being mounted on a sliding track) which we then stitch together into a seamless whole.
  • Item with texture or different item depths, e.g. a folded map, tipped into a book – It is often challenging to properly support these items and level the map so that it is all in focus within the camera’s depth of field.
  • ANR volume – These are large, heavy volumes that typically contain older newspapers and periodicals. The paper can be very fragile and they have to be handled and supported carefully so as not to damage or tear the material.
  • Item with a tight binding and text that runs into the gutter – We do our best to capture all of the text, but it will sometimes appear to curve or disappear into the gutter in the resulting digital image.


Working through this list with Megan, I was struck by the diversity of materials that we collect and digitize. The training process also highlighted the variety of tricks, techniques, and hacks that we employ to get the best possible digital transfers, given the limitations of the available technology and the materials’ condition. I came out of the experience with a renewed appreciation of the complexity of the digitization work we do in the DPC, the significance of the rare materials in the collection, and the excellent service that we are able to provide to researchers through the Rubenstein Library.


Check out Megan’s blog post on the Devil’s Tale for more on the other media formats I wasn’t able to cover in the scope of this post.

Here’s to more collaboration across boundaries in the New Year!


Vagrant Up

Ruby on Rails logo

Writing software for the web has come a long way in the past decade. Tools such as Ruby on Rails, Twitter Bootstrap, and jQuery provide standard methods for solving common problems and providing common features of web applications. These tools help developers use more of their time and energy to solve problems unique to the application they are building.

One of the costs of relying on libraries and frameworks to build software is that your project not only depends on the frameworks and libraries you’ve chosen to use, but these frameworks and libraries rely on still more components. Compounding this problem, software is never really finished. There are always bugs being fixed and features being changed and added. These attributes of software, that it changes and has dependencies, complicate software projects in a few different ways:

  • You must carefully manage the versions of libraries, frameworks, and other dependencies so that you ensure all the pieces work together.
  • Developers working on multiple projects may have to use different versions of the same libraries and packages for each of their projects.
  • If you’re working with a team of developers on a project, all members of the team have to make sure their computers are set up with the correct versions of the software and dependencies of the project.

Thankfully, there are still more tools available to help manage these problems. For instance, Ruby Version Manager (RVM) is a popular tool used by Ruby on Rails developers. It lets the software developer install and switch between different versions of Ruby. Other tools, such as Bundler, make it possible to define exactly what version of which Ruby gems (Ruby software packages that add functionality to your own project) you need to install for a particular project. Combined, RVM and Bundler simplify the management of complex project dependencies. There are similar tools available for other programming languages, such as Composer, which is a dependency manager for PHP.


While many of us already use dependency managers in our work, one tool we haven’t been using that we’re evaluating for use on a new project is Vagrant. Vagrant is a tool for creating virtual machines, self-contained systems that run within a host operating system. Virtual machines are software implementations of a computer system. For instance, using a virtual machine I could run Windows on my Mac hardware.

Vagrant does a few things that may make it even easier for developers to manage project dependencies.

  • With Vagrant you can write a script that contains a set of instructions about what operating system and other software you want to install in a virtual machine. Creating a virtual machine with all the software you need for a given project is then as simple as typing a single command.
  • Vagrant provides a shared directory between your host operating system and the virtual machine. You can use the operating system you use every day as you work, while the software project runs in a virtual machine. This is significant because it means each developer can continue to use the operating system and tools they prefer while the software they’re building is all running in copies of the exact same system.
  • You can add the script for creating the virtual machine to the project itself, making it very easy for new developers to get the project running. They don’t have to go through the sometimes painful process of installing a project’s dependencies by hand because the Vagrant script does it for them.
  • A developer working on multiple projects can have a virtual machine set up for each of their projects so they never interfere with each other and each has the correct dependencies installed.

Here’s how to use Vagrant in the most minimal way:

  1. Download and install VirtualBox
  2. Download and install Vagrant
  3. In a terminal window type:
    vagrant init hashicorp/precise32
  4. After running the following command you will have downloaded, set up, and started a fully functional virtual machine running Ubuntu:
    vagrant up
  5. You can then connect to and start using the running virtual machine by connecting to it via SSH:
    vagrant ssh

"Vagrantup" by Fco.plj - Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Vagrantup.jpg#mediaviewer/File:Vagrantup.jpg

In a more complex setup you’d probably add a provisioning script with instructions for downloading and installing additional software as part of the “vagrant up” process. See the Vagrant documentation for more details about provisioning options.

We’re considering using Vagrant on an upcoming project in an effort to make it easier for all the developers on the project to set up and maintain a working development environment. With Vagrant, just one developer will need to spend the time to create the script that generates the virtual machine for the project. This should save the time of other developers on the project who should only have to install VirtualBox, copy the Vagrant file and type “vagrant up.” At least, that’s the idea. Vagrant has great documentation, so if you’re interested in learning more their website is a good place to start.

Digital Transitions Roundtable

In late October of this year, the Digital Production Center (along with many others in the Library) was busy developing budgets for FY 2015. We were asked to think about the needs of the department, where the bottlenecks were and possible new growth areas. We were asked to think big. The idea was to develop a grand list and work backwards to identify what we could reasonably ask for. While the DPC is able to digitize many types of materials and formats, such as audio and video, my focus is specifically still image digitization. So that’s what I focused on.

We serve many different parts of the Library, and in order to accommodate a wide variety of requests, we use many different types of capture devices in the DPC: high-speed scanners, film scanners, overhead scanners and high-end cameras. The most heavily used capture device is the Phase One camera system, which pairs a P65 60 MP digital back with a 72mm Schneider flat field lens. This enables us to capture high-quality images at archival standards. The majority of material we digitize with this camera is bound volumes (most of them rare books from the David M. Rubenstein Library), but we also use it for patron requests (which have increased significantly over the years; everything is expected to be digital, it seems), oversized items, glass plate negatives, high-end photography collections and much more. It is no surprise that this camera is a bottleneck for still image production. In researching cameras to include in the budget, I was hard pressed to find another camera system that can compete with the Phase One. For over 5 years we have used Digital Transitions, a New York-based provider of high-end digital solutions, for our Phase One purchases and support. We have been very happy with the service, support and equipment we have purchased from them over the years, so I contacted them to inquire about new equipment on the horizon and pricing for upgrading our current system.

One new piece of equipment they pointed me to is the BC100 book scanner. This scanner uses a 100° glass platen and two reprographic cameras to capture two facing pages at the same time. While there are other camera systems that use a similar two-camera setup (most notably the Scribe, Kirtas and Atiz), the cameras and digital backs used with the BC100, as well as the CaptureOne software that drives the cameras, are better suited for cultural heritage reproduction. Along with the new BC100, CaptureOne is now offering a new software package specifically geared toward the cultural heritage community for use with this new camera system. While inquiring about the new system, I was invited to attend a Cultural Heritage Round Table event that Digital Transitions was hosting.

This roundtable focused on the new CaptureOne software for use with the BC100 and the specific needs of the cultural heritage community. I have always found the folks at Digital Transitions to be very professional, knowledgeable and helpful. The event they put together included Jacob Frost, Application Software R&D Manager for Phase One; Doug Peterson, Technical Support, Training, R&D at Digital Transitions; and Don Williams, imaging scientist at Image Science Associates. Don is also on the Still Image Digitization Advisory Board of the Federal Agencies Digitization Guidelines Initiative (FADGI), a collaborative effort by federal agencies to define common guidelines, methods, and practices for digitizing historical content. They talked about the new features of the software, the science behind it and its color technology, and new information about the FADGI still image standard that we currently follow at the Library.

I was impressed by the information provided and the knowledge shared, but what impressed me the most was that the main reason Digital Transitions pulled this particular group of users and developers together was to ask us what the cultural heritage community needed from the new software. WHAT!? What we need from the software? I’ve been doing this work for about 15 years now, and I think that’s the first time any software developer from any digital imaging company has asked our community specifically what we need. Don’t get me wrong, there is a lot of good software out there, but usually the software comes “as is.” While it is fully functional, there are usually some work-arounds to get the software to do what I need it to do. We, as a community, spent about an hour drumming up ideas for software improvements and features.

While we still need to see follow-through on what we discussed, I am hopeful that some of those features will show up in the software. The software still needs some work to be truly beneficial (especially in post-production), but Phase One and Digital Transitions are definitely on to something.

Assembling the Game of Stones

Back in October, Molly detailed DigEx’s work on creating an exhibit for the Link Media Wall. We’ve now finalized our content and hope to have the new exhibit published to the large display in the next week or two. I’d like to detail how this thing is actually put together.

HTML Code

In our planning meetings the super group talked about a few different approaches for how to start. We considered using a CMS like WordPress or Drupal, Four Winds (our institutional digital signage software), or potentially rolling our own system. In the end, though, I decided to build using super basic HTML / CSS / Javascript. After the group was happy with the design, I built a simple page framework to match our desired output of 3840 x 1080 pixels. And when I say simple, I mean simple.


I broke the content chunks into five main sections: the masthead (which holds the branding), the navigation (which highlights the current section and construction period), the map (which shows the location of the buildings), the thumbnail (which shows the completed building and adds some descriptive text), and the images (which houses a set of cross-fading historic photos illustrating the progression of construction). Working with a fixed-pixel layout feels strange in the modern world of web development, but it’s quick and satisfying to crank out. I’m using the jQuery Cycle plugin to transition the images, which is lightweight and offers lots of configurable options. I also created a transparent PNG file containing a gradient that fades to the background color which overlays the rotating images.
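
For illustration, initializing the plugin looks roughly like this; the container selector and timing values here are hypothetical, not the exhibit’s actual settings.

    // Cross-fade the historic construction photos with the jQuery Cycle plugin.
    $(document).ready(function () {
      $('.images').cycle({
        fx: 'fade',      // cross-fade between photos
        speed: 1000,     // fade duration in milliseconds
        timeout: 6000    // how long each photo stays on screen
      });
    });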

Another part of the puzzle I wrestled with was how to transition from one section of the exhibit to another. I thought about housing all of the content on a single page and using some JS to move from one to the next, but I was a little worried about performance so I again opted for the super simple solution. Each page has a meta refresh in the header set to the number of seconds that it takes to cycle through the corresponding set of images and with a destination of the next section of the exhibit. It’s a little clunky in execution and I would probably try something more elegant next time, but it’s solid and it works.

Here’s a preview of the exhibit cycling through all of the content. It’s been time compressed – the actual exhibit will take about ten minutes to play through.

In a lot of ways this exhibit is an experiment in both process and form, and I’m looking forward to seeing how our vision translates to the Media Wall space. Using such simple code means that if there are any problems, we can quickly make changes. I’m also looking forward to working on future exhibits and helping to highlight the amazing items in our collections.

New Angles & Avenues for Bitstreams

This week, we added a display of our most recent Bitstreams blog posts to our Digital Collections homepage (example), and likewise, a view of posts relevant to a given collection on the respective collection’s homepage (example).


Background

Our Digital Projects & Production team has been writing in Bitstreams at least weekly since February 2014. We’ve had some excellent guest contributors, too. Some posts share updates about new digital collections or additions, while others share insights, lessons learned, and behind-the-scenes looks at the projects we’re currently tackling.

Many of our posts have been featured on our library homepage and library news site. But until now, we haven’t been able to display any of them—not even the ones about new digital collections—alongside the collections themselves. So, if you visited the DukEngineer collection in the past, you likely missed out on Melanie’s excellent overview, which puts the magazine in context and highlights the best of what’s inside.

Past Solutions

Syndicating tagged blog posts for display elsewhere is a pretty common use case, and we’ve used a bunch of different solutions as our platforms have evolved. Each solution has naturally been painstakingly tailored to accommodate the inner workings of both the source and the destination. Seven years ago, we were writing custom XSLT to create and then consume our own RSS feeds in Cascade Server CMS. We have since hopped over to WordPress for managing news and blogs (whew!). An older version of our digital collections app used WordPress’ XML-RPC API to get tagged posts and parsed them with Python.

These days, our library website does blog syndication by using a combo of WordPress RSS, Drupal’s feed aggregator module, and occasionally Yahoo! Pipes for data mashing and munging. It works well in Drupal, but other platforms require other approaches.

Under the Hood: AngularJS and WordPress JSON API

Bret Davidson’s Code4Lib 2014 presentation, Towards Pasta Code Nirvana: Using JavaScript MVC to Fill Your Programming Ravioli  (slides) made me hungry. Hungry for pasta, yes, but also for knowledge. I wanted to:

  1. Experiment with one of the Javascript MVC frameworks to learn how they work, and in the process…
  2. Build something potentially useful for digital collections that could be ported over to a new application framework in the future (e.g., from our current Django app to a future Ruby on Rails app).

From the many possibilities, I chose AngularJS. It seemed well-documented and increasingly popular, and with Google’s backing, it seems like it’ll be around for a while.

WordPress JSON API

Among Angular’s virtues is that it really simplifies the process of getting and using JSON data from an API. I found the WordPress JSON API plugin, which, interestingly, was developed by staff at MoMA so they could use WordPress as a back-end to a site with a Rails front-end. So we first had to enable that plugin for our Bitstreams blog.

AngularJS

AngularJS definitely helps keep code clean, especially by abstracting the model (the blog posts and associated characteristics, as well as the page state) from the view (how to display the data) from the controller (which gets and refines the data into the model and updates the model upon interactions with the view). I’ve done several projects in the past using jQuery and DOM manipulation to retrieve and display data. It usually works, but in the process I create a veritable rat’s nest of spaghetti code wherein /* no amount of commenting */ can truly help disentangle what’s happening.
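
To give a flavor of that separation, here is a stripped-down sketch of a controller that fetches posts and exposes them as the model for the view to render. The module name, controller name, and endpoint URL are placeholders, not the production code (the real source is linked below).

    // Sketch of an AngularJS 1.x controller: fetch posts, expose them as the model.
    var app = angular.module('bitstreamsApp', []);

    app.controller('BlogPostsCtrl', ['$scope', '$http', function ($scope, $http) {
      $scope.blogposts = [];  // the model that the view's ng-repeat iterates over
      $scope.pageSize = 5;

      // Hypothetical WordPress JSON API endpoint; the real URL and parameters differ.
      $http.get('http://blogs.library.duke.edu/bitstreams/?json=get_recent_posts')
        .success(function (data) {
          $scope.blogposts = data.posts;
        });
    }]);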

Angular also supercharges HTML with more useful attributes to control a display. I’ve only just scratched the surface, but it’s clear that built-in directives like ng-repeat and filters like limitTo spare me from writing a ton of Javascript, e.g., <li ng-repeat="post in blogposts | limitTo:pageSize">. After the initial learning curve, the markup is visually intuitive. And it’s nice that directives and filters are extensible so you can make your own.

Source code: controller js, HTML (view source)

Initial Lessons Learned

  • AngularJS has a steeper learning curve than I’d expected; I assumed I could do this mini-project in a few hours, but it took a couple of days to really get a handle on the basic pieces I needed for this project.
  • Writing an Angular app within a Django app is tricky. Both use {{ variable }} template tags, so I had to change Angular to use [[ variable ]] instead (see the sketch below).
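
Here is a minimal sketch of that workaround using Angular’s $interpolateProvider (the module name is a placeholder):

    // Tell Angular to use [[ ]] so it doesn't collide with Django's {{ }} template tags.
    angular.module('bitstreamsApp', [])
      .config(['$interpolateProvider', function ($interpolateProvider) {
        $interpolateProvider.startSymbol('[[');
        $interpolateProvider.endSymbol(']]');
      }]);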

Looking Ahead

I consider this an encouraging proof of concept. While our own blog posts can be interesting, there are many other sources of valuable data out in the world relevant to our collections that would add value for our researchers if we could easily get and display them. AngularJS won’t be the answer to all of these needs, but it’s nice to have in the toolset.

Dispatches from the Digital Library Federation Forum

On October 27-29 librarians, archivists, developers, project managers, and others met for the Digital Library Federation (DLF) Forum in Atlanta, GA. The program was packed to the gills with outstanding projects and presenters, and several of us from Duke University Libraries were fortunate enough to attend.  Below is a round up of notes summarizing interesting sessions, software tools, projects and collections we learned about at the conference.

Please note that these notes were written by humans listening to presentations and mistakes are inevitable.  Click the links to learn more about each tool/project or session straight from the source.

Tools and Technology

Spotlight is an open-source tool for featuring digitized resources and is being developed at Stanford University.  It appears to have fairly similar functionality to Omeka, but is integrated into Blacklight, a discovery interface used by a growing number of libraries.

 

The J. Willard Marriott Library at the University of Utah presented on their use of Pamco Imaging tools to capture 360-degree images of artifacts. The library purchased a system from Pamco that includes an automated turntable, lighting tent and software to both capture and display the 3-D objects.

 

There were two short presentations about media walls; one from our friends in Raleigh at the Hunt Library at N.C. State University, and the second from Georgia State.  Click the links to see just how much you can do with an amazing media wall.

Projects and Collections

The California Digital Library (CDL) is redesigning and reengineering their digital collections interface to create a kind of mini-Digital Public Library of America just for University of California digital collections. They are designing the project using a platform called Nuxeo and storing their data through Amazon Web Services. The new interface and platform development is highly informed by user studies done on the existing Calisphere digital collections interface.

 

Emblematica Online is a collection of digitized emblem books contributed by several global institutions, including Duke. The collection is hosted by the University of Illinois at Urbana-Champaign. The project has been conducting user studies and hopes to publish them in the coming year.

 

The Indiana University Media Digitization and Preservation Initiative started in 2009 with a survey of all the audio and visual materials on campus. In 2011, the initiative proposed digitizing all rare and unique audio and video items within a 15-year period. However, in 2013 the President of the University said that the campus would commit to completing the project in a 7-year period. To accomplish this ambitious goal, the university formed a public-private partnership with Memnon Archiving Services of Brussels. The university estimates that they will create over 9 petabytes of data. The initiative has been in the planning phases and should be ramping up in 2015.

Selected Session Notes

The Project Managers group within DLF organized a session on “Cultivating a Culture of Project Management” followed by a working lunch. Representatives from Johns Hopkins and Brown talked about implementing Agile methodology for managing and developing technical projects. Both libraries spoke positively about moving towards Agile, and the benefits of clear communication lines and defined development cycles. A speaker from Temple University discussed her methods for tracking and communicating the capacity of her development team; her spreadsheet for doing so took the session by storm (I’m not exaggerating – check out Twitter around the time of this session). Two speakers from the University of Michigan shared their work in creating a project management special interest group within their library to share PM skills, tools and heartaches.

A session entitled “Beyond the Digital Surrogate” highlighted the work of several projects that are using digitized materials as a starting point for text mining and visualizing data. First, many of UNC’s Documenting the American South collections are available as a text download. Second, a tool out of Georgia Tech supports interactive exploration and visualization of text-based archives. Third, a team from the University of Nebraska-Lincoln is developing methods for using visual information to leverage discovery and analysis of digital collections.

 

Assessment

“Moving Forward with Digital Library Assessment” was organized around the need to strategically focus our assessment efforts in digital libraries and to better understand and measure the value, impact, and associated costs of what we do.

Community notes for this session

  • Joyce Chapman, Duke University
  • Jody DeRidder, University of Alabama
  • Nettie Lagace, National Information Standards Organization
  • Ho Jung Yoo, University of California, San Diego

Nettie Lagace: update on NISO’s altmetrics initiative.

  • The first phase exposed areas for potential standardization. The community then collectively prioritized those potential projects, and the second phase is now developing best practices for them. A working group has been formed, with its recommendations due in June 2016.
  • Alternative Metrics Initiative Phase 1 White Paper 

Joyce Chapman: a framework for estimating digitization costs

Jody DeRidder and Ho Jung Yoo: usability testing

  • What critical aspects need to be addressed by a community of practice?
  • What are next steps we can take as a community?

Midnight in the Garden of Film and Video

A few weeks ago, archivists, engineers, students and vendors from across the globe arrived in the historic city of Savannah, GA for AMIA 2014. The annual conference for The Association of Moving Image Archivists is a gathering of professionals who deal with the challenge of preserving motion picture film and videotape content for future generations. Since today is Halloween, I must also point out that Savannah is a really funky city that is haunted! The downtown area is filled with weeping willow trees, well-preserved 19th century architecture and creepy cemeteries dating back to the U.S. Civil and Revolutionary wars. Savannah is almost as scary as a library budget meeting.

The bad moon rises over Savannah City Hall.

Since many different cultural heritage institutions are digitizing their collections for preservation and online access, it’s beneficial to develop universal file standards and best practices. For example, organizations like NARA and FADGI have contributed to the universal adoption of the 8-bit uncompressed TIFF file format for (non-transmissive) still image preservation. Likewise, for audio digitization, 24-bit uncompressed WAV has been universally adopted as the preservation standard. In other words, when it comes to still image and audio digitization, everyone is driving down the same highway. However, at AMIA 2014, it was apparent there are still many different roads being taken in regards to moving image preservation, with some potential traffic jams ahead. Are you frightened yet? You should be!

The smallest known film gauge: 3mm. Was it designed by ancient druids?

Up until now, two file formats have been competing for dominance for moving image preservation: 10-bit uncompressed (.mov or .avi wrapper) vs. Motion JPEG2000 (MXF wrapper). The disadvantage of uncompressed has always been its enormous file size. Motion JPEG2000 incorporates lossless compression, which can reduce file sizes by 50%, but it’s expensive to implement, and has limited interoperability with most video software and players. At AMIA 2014, some were championing the use of a newer format, FFV1, a lossless codec that has compression ratios similar to JPEG2000, but is open source, and thus more widely adoptable. It is part of the FFmpeg software project. Adoption of FFV1 is growing, but many institutions are still heavily invested in 10-bit uncompressed or Motion JPEG2000. Which format will become the preservation standard, and which will become ghosts that haunt us forever?!?

Another emerging need is for content management systems that can store and provide public access to digitized video. The Hydra repository solution is being adopted by many institutions for managing preservation video files. In conjunction with Hydra, many are also adopting Avalon to provide public access for online viewing of video content. Like FFmpeg, both Hydra and Avalon are open source, which is part of their appeal. Others are building their own systems, catered specifically to their own needs, like The Museum of Modern Art. There are also competing metadata standards. For example, PBCore has been adopted by many public television stations, but is generally disliked by libraries. In fact, they find it really creepy!

A new print of Peter Pan was shown at AMIA 2014. That movie gave me nightmares as a child.

Finally, there is the thorny issue of copyright. Once file formats are chosen and delivery systems are in place, methods must be implemented to limit access to intended users, to protect copyright and hinder piracy. The Avalon Media System enables rights and access control to video content via guest passwords. The Library of Congress works around some of these issues another way, by setting up remote viewing rooms in Washington, DC, which are connected via fiber-optic cable to their Audio-Visual Conservation Center in Culpeper, Va. Others, with more limited budgets, like Dino Everett at USC Cinematic Arts, watermark their video, upload it to sites like Vimeo, and implement temporary password protection, canceling the passwords manually after a few weeks. I mean, is there anything more frightening than a copyright lawsuit? Happy Halloween!