We’re embarking on a project to adopt or build a metadata tool at Duke University Libraries. Before we’re immersed in architectures, designs, workflows, schedules, layers, platforms, capacities, etc., I’d like to indulge in some guilt-free big thinking. I thought I’d just kind of put the question out there: What are some of the big ideas that could inform the development of a metadata tool?
I invite conversation here and on the web4lib and code4lib lists, to which I’m sending an abridged version of this post. Other conversations will occur in various venues over the next month or so. I’ll try to pull together and post on anything I see, hear, read or say. In the meantime, I’ll share one big idea that I’ve been considering; I’m not saying it’s THE big idea or even implying that we’ll follow through on it at Duke. It’s just one way to bend our thinking about this project. I’m interested in other ideas that can help with the bending of the thinking on the project for the tool for the metadata.
The idea that I’m posing follows from a blog post that Lorcan Dempsey wrote in May, mentioning an example of a “shared cataloging environment”. When I read it, I wondered, what if you take that idea to its logical (illogical?) extreme: a metadata tool as a software-as-a-service (SaaS) platform.
In this scenario, there’s a negligible barrier to entry. Let’s say that “anyone” can sign up, as with flickr or youtube, and use it to describe any collection of things, including things that have URLs (flickr images) and things that don’t (vinyl LPs on my shelf, bulbs planted in my flower bed). Nothing that I’m describing with my metadata actually resides on the platform’s server. I’m just making records, which point to something online, or refer to something either on my hard drive or my shelf (or stuck to a contact sheet in a binder, or arranged in a box, etc.).
A Service-Oriented Architecture (SOA) allows me to embed online resources (flickr images, youtube videos) in the display of my metadata records. Whenever I want, I can harvest my metadata in a variety of forms. The platform also has features for creating and sharing custom metadata schemas and authority lists, and for publishing, sharing and organizing the contents of collections. If you further imagine extending that SOA to embed the output of digitization workflows, then digital library or digital collections applications enter the scenario. But I’m going to provide a couple of usage scenarios not related to libraries, just for now.
Scenario 1: OTRR, or stuff that’s not online (or is, either way)
A colleague of mine at Duke Libraries, Randy Riddle, collects old-time radio transcriptions. He writes a blog about his collection, builds podcasts, and participates in a community, the Old Time Radio Researchers Group. The community maintains a wiki site, and embedded on many of the pages are structured metadata records. See, for example, the pages on “The Adventures of Superman.”
Randy sat down with me and provided a list of the metadata fields that he thinks the community might use to track and share their collections, and it’s fairly extensive: Series Title, Episode Title, Alternative Title, Date of Original Broadcast, Episode Number, Personnel (writer, actor, producer), Synopsis, Publisher (network, local show, syndicating company), Running Time, Sponsor, and then a number of fields for information specific to the disc (Material, Matrix Number, Pressing Company) and to the generation of a recording or dub if it’s not a disc.
A collectors’ community like OTRR could collaborate on a metadata schema, implement it in our hypothetical SaaS metadata tool, and build their own authority lists. They could maintain their individual collections, pool them, and potentially build other services around the metadata. They could link to digitized versions of items in their collections if they exist, or they could enter “metadata only” records.
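To make that a little more concrete, here is a rough sketch of what a community-defined schema and a “metadata only” record might look like. None of this is a real design: the class names (FieldDef, Schema, Record), the repeatable flag and the validation rules are all assumptions of mine, and the record values are placeholders. Only the field names come from Randy’s list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

Value = Union[str, List[str]]


@dataclass
class FieldDef:
    name: str
    repeatable: bool = False  # assumption: some fields hold several values


@dataclass
class Schema:
    name: str
    fields: List[FieldDef]

    def validate(self, values: Dict[str, Value]) -> List[str]:
        """Return a list of problems; an empty list means the values fit."""
        known = {f.name: f for f in self.fields}
        problems = []
        for key, value in values.items():
            if key not in known:
                problems.append(f"unknown field: {key}")
            elif isinstance(value, list) and not known[key].repeatable:
                problems.append(f"field is not repeatable: {key}")
        return problems


@dataclass
class Record:
    values: Dict[str, Value]
    url: Optional[str] = None  # None marks a "metadata only" record


# Field names come from the list above; which ones repeat is a guess.
OTRR_SCHEMA = Schema("otrr", [
    FieldDef("Series Title"),
    FieldDef("Episode Title"),
    FieldDef("Alternative Title", repeatable=True),
    FieldDef("Date of Original Broadcast"),
    FieldDef("Episode Number"),
    FieldDef("Personnel", repeatable=True),
    FieldDef("Synopsis"),
    FieldDef("Publisher"),
    FieldDef("Running Time"),
    FieldDef("Sponsor"),
    FieldDef("Material"),
    FieldDef("Matrix Number"),
    FieldDef("Pressing Company"),
])

# A disc on a shelf: no URL, and placeholder values only.
disc = Record({
    "Series Title": "The Adventures of Superman",
    "Material": "(placeholder)",
    "Personnel": ["(placeholder writer)", "(placeholder actor)"],
})

print(OTRR_SCHEMA.validate(disc.values))        # no problems expected
print(OTRR_SCHEMA.validate({"Colour": "red"}))  # one unknown field
```

The point of keeping the URL optional is that the same record shape works whether the item is online or sitting in a box.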
Scenario 2: birdwatchers, or stuff that’s online (or maybe not, either way)
Let’s say I enjoy birdwatching. I go out on weekends to take photographs and upload them to flickr, where I share them with other birdwatchers. There exist a number of such groups on flickr. But flickr only gives my group a couple of metadata fields to work with, and besides, some of my friends shoot video and others record high-quality audio. We’d like to be able to catalog these resources with a little more precision: Species, Place, Date and Media Format. Just those four fields would provide a lot of discovery power for our community.
So we post our pictures and videos to flickr and youtube, but we can also sign in to this metadata tool and create a collection that we’re all going to share. We collaborate on our metadata profile and populate the authority lists with species names. We might even practice “Wikithority” — linking terms to their entries on Wikipedia (“Sialia mexicana”: http://en.wikipedia.org/wiki/Western_bluebird). Whenever I want to add a new item to my personal collection — which is a member of the “Birdwatchers” community — it defaults to the “birdwatchers” metadata profile.
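Here is one way the shared profile, the linked authority list and the “defaults to the community profile” behavior could hang together. It is only a sketch under my own assumptions: AuthorityList, Profile, Community and Collection are hypothetical names, not anything we have built, and the only real data is the Wikipedia link mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AuthorityList:
    name: str
    # term -> optional link (the "Wikithority" idea); None means no link yet
    terms: Dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, term: str, link: Optional[str] = None) -> None:
        self.terms[term] = link


@dataclass
class Profile:
    name: str
    fields: List[str]
    # field name -> the authority list that controls its values
    authorities: Dict[str, AuthorityList] = field(default_factory=dict)

    def unknown_terms(self, record: Dict[str, str]) -> List[str]:
        """Values in controlled fields that are not in the authority list."""
        out = []
        for fname, auth in self.authorities.items():
            value = record.get(fname)
            if value is not None and value not in auth.terms:
                out.append(value)
        return out


@dataclass
class Community:
    name: str
    profile: Profile


@dataclass
class Collection:
    owner: str
    community: Optional[Community] = None

    def default_profile(self) -> Optional[Profile]:
        # A collection that belongs to a community inherits its profile.
        return self.community.profile if self.community else None


species = AuthorityList("species")
species.add("Sialia mexicana", "http://en.wikipedia.org/wiki/Western_bluebird")

birders = Community("Birdwatchers", Profile(
    "birdwatchers",
    ["Species", "Place", "Date", "Media Format"],
    {"Species": species},
))
mine = Collection("me", birders)

print(mine.default_profile().name)
print(birders.profile.unknown_terms({"Species": "bluebird"}))
```

A term that isn’t in the list yet (“bluebird” here) gets flagged rather than rejected, which leaves room for the group to decide whether to add it.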
Let’s imagine further that the platform even performs some discovery functions: keyword searching, faceted browsing, and maybe “advanced search.” It has a front end that looks something like the “Tripod” system we developed for digital collections at Duke. So when I’m under everyonesmetadatatool.org/groups/birdwatchers, I’ll see facets from the birdwatchers profile.
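Faceted browsing over a profile like that can be sketched in a few lines. This is not how Tripod works; it is just an illustration, assuming records are plain dictionaries keyed by the four birdwatcher fields. The function names and the sample records (places and dates especially) are made up.

```python
from collections import Counter
from typing import Dict, List

FACET_FIELDS = ["Species", "Place", "Date", "Media Format"]


def facet_counts(records: List[Dict[str, str]],
                 fields: List[str]) -> Dict[str, Counter]:
    """Count how many records carry each value of each facet field."""
    return {f: Counter(r[f] for r in records if f in r) for f in fields}


def apply_filters(records: List[Dict[str, str]],
                  selected: Dict[str, str]) -> List[Dict[str, str]]:
    """Keep only records matching every selected facet value."""
    return [r for r in records
            if all(r.get(f) == v for f, v in selected.items())]


# Made-up sample data for the hypothetical birdwatchers group.
records = [
    {"Species": "Sialia mexicana", "Place": "Place A",
     "Date": "2009-05-02", "Media Format": "photo"},
    {"Species": "Sialia mexicana", "Place": "Place B",
     "Date": "2009-05-09", "Media Format": "video"},
    {"Species": "Cardinalis cardinalis", "Place": "Place A",
     "Date": "2009-05-09", "Media Format": "audio"},
]

counts = facet_counts(records, FACET_FIELDS)
print(counts["Species"].most_common())

narrowed = apply_filters(records, {"Species": "Sialia mexicana",
                                   "Media Format": "video"})
print(len(narrowed))
```

Because the facets are driven by the profile’s field list, a different community would get different facets from the same code.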
Meanwhile, libraries can use the tool for digitization projects by forming their own communities and importing records from their digitization workflows. The scenario can use some fleshing out, possibly in a follow-up post.
What if libraries developed a tool that provides this service for any online community? Would it position the library in the midst of the social networking and online community-building culture where, we believe, research and the exchange of ideas are actually occurring? It seems to me that this kind of environment puts the library in the position of staking its own claim on the basic idea of online collections, something that I think has been co-opted by the big SaaS platforms like flickr.
I’d appreciate any feedback on this idea, including, “You could never make it work, Duke!” (along with reasons why). I’d also appreciate any other “big thinking” that folks might have to offer on the possibilities for a metadata tool platform.