I attended the CNI Spring Task Force Meeting in Minneapolis, April 6-7, 2009. Below are some takeaways that I found noteworthy, especially as they relate to repositories.
Keynote Address – David Rosenthal, Chief Scientist, LOCKSS, Stanford University: David challenged some of the prevailing thought on digital preservation regarding format obsolescence. He stated that incompatibility is not inevitable, rather that “creating incompatibility = reinventing the wheel”. He argued that format obsolescence never happens. He backed this up with evidence from the last few decades. The moral of the story: If we go ahead and just collect the bits, we will be fine. A rather freeing thought, given that the perceived complexities often make digital preservation a non-starter.
JPEG2000 is a viable alternative: Ryan Chute, from Los Alamos National Laboratory, demonstrated Djatoka (pronounced jay-too-kay), an open source JPEG2000 image server built with the Kakadu software library. The Djatoka server now has two client implementations (an IIP implementation at the Biodiversity Heritage Library, and OpenLayers at UNC). Conceivably, JPEG2000 could be used as both a presentation format and a preservation format (lossless compression around 2:1 and visually lossless compression around 10:1 from TIFFs). The demonstration looked very sharp, though we will need to pay attention to how it performs in production environments. I discussed with Ryan the plans for integration with Fedora, and there are a few implementation paths to evaluate.
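To put the compression ratios above in concrete terms, here is a back-of-the-envelope sketch in Python. The 1,000 GB collection size is a made-up figure for illustration, and the ratios are the approximate ones quoted in the talk; real results will vary with the source images.

```python
def estimated_size_gb(tiff_size_gb: float, ratio: float) -> float:
    """Estimate storage after compressing TIFF masters at ratio:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return tiff_size_gb / ratio


# Hypothetical collection size, not a figure from the presentation.
collection_tiff_gb = 1000.0

# Approximate ratios quoted in the talk.
lossless_gb = estimated_size_gb(collection_tiff_gb, 2.0)
visually_lossless_gb = estimated_size_gb(collection_tiff_gb, 10.0)

print(f"Uncompressed TIFF:          {collection_tiff_gb:.0f} GB")
print(f"Lossless JPEG2000:          {lossless_gb:.0f} GB")
print(f"Visually lossless JPEG2000: {visually_lossless_gb:.0f} GB")
```

Under those assumptions, lossless JPEG2000 would roughly halve the storage needed for masters, while the visually lossless setting would cut it to about a tenth.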
Preservation services in the clouds, DuraSpace: Sandy Payette and Michele Kimpton discussed the joint venture between Fedora Commons and the DSpace Foundation. DuraSpace will be a service (eventually a set of services) as well as open source software. The initial use case will be a preservation-based service in the cloud. They have identified a few sites with which they will pilot these services. By Q1 2010, they expect to have extensions available for Fedora and DSpace to plug into these cloud services. I asked about a scenario where we might store preservation copies in the cloud and store derivatives locally, and have Fedora and Akubra broker the data to the right store; they said this is a scenario they are planning for.
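The brokering scenario I asked about can be sketched roughly as follows. This is only an illustration in plain Python: the `Store` class, the `route` function, and the role names are all hypothetical, and none of it reflects the actual Akubra or DuraSpace interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical stand-in for a storage backend (cloud or local)."""
    name: str
    blobs: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.blobs[key] = data


def route(role: str, cloud: Store, local: Store) -> Store:
    """Send preservation copies to the cloud store, everything else locally."""
    return cloud if role == "preservation" else local


cloud_store = Store("cloud")
local_store = Store("local")

# Each datastream carries a role that decides where it lands.
datastreams = [
    ("book1/master.jp2", "preservation", b"master bytes"),
    ("book1/thumb.jpg", "derivative", b"thumbnail bytes"),
]
for key, role, data in datastreams:
    route(role, cloud_store, local_store).put(key, data)

print(sorted(cloud_store.blobs))
print(sorted(local_store.blobs))
```

The point of the sketch is that the routing decision lives in one place, so the repository itself does not need to know which physical store holds a given datastream.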
Cool Book Digitization Workflow at Northwestern: I attended a presentation by Claire Stewart and Steve DiDomenico from Northwestern on their web-based book digitization workflow, codenamed “crabcake”. They are digitizing books and ingesting them into Fedora. Their Fedora implementation is similar to ours, with an atomistic content model and use of METS for structural metadata. They have a very clean set of workflow tools. The most impressive part of their presentation was their GUI for manipulating the METS structure of a book digital object. This interface is built heavily with Ext JS. Their project is grant funded, and they will be releasing the tools as open source in the summer. From what I can tell, installing their tools may require some adoption of their local practices, at the very least their interpretation of METS. Regarding their digitization/QC process, they have a lot of throughput: they push things into Fedora with very little human intervention and fix problems later, in essence getting things online with very little impediment.
Trident project report: I gave an update on the Trident project. The presentation was well attended, and the project was well received. There was good discussion around the metadata application profile, its possible extension to different metadata schemas, and general use cases for the Editor. There was a general validation that our project continues to head in the right direction.