International Broadsides (added to migrated Broadsides and Ephemera collection): https://repository.duke.edu/dc/broadsides
Orange County Tax List Ledger, 1875: https://repository.duke.edu/dc/orangecountytaxlist
Radio Haiti Archive, second batch of recordings: https://repository.duke.edu/dc/radiohaiti
William Gedney Finished Prints and Contact Sheets (newly re-digitized with new and improved metadata): https://repository.duke.edu/dc/gedney
In addition to the brand new items, the digital collections team is constantly chipping away at the digital collections migration. Here are the latest collections to move from Tripod 2 to the Duke Digital Repository (these are either available now or will be very soon):
What we hoped would be a speedy transition is still a work in progress 2 years later. This is due to a variety of factors, one of which is that the work itself is very complex. Before we can move a collection into the digital repository, it has to be reviewed, all digital objects fully accounted for, and all metadata remediated and crosswalked into the DDR metadata profile. Sometimes this process requires little effort. Other times, however, especially with older collections, we have items with no metadata, or metadata with no items, or the numbers in our various systems simply do not match. Tracking down the answers can require some major detective work on the part of my amazing colleagues.
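To give a flavor of that detective work, here is a minimal sketch of the first pass: comparing the identifiers found among the digital files with the identifiers found in the metadata. The function name and the identifiers are hypothetical, made up for illustration; this is not the team's actual tooling.

```python
def reconcile(file_ids, metadata_ids):
    """Compare identifiers found among the files with those found in the metadata."""
    files = set(file_ids)
    records = set(metadata_ids)
    return {
        "items_without_metadata": sorted(files - records),
        "metadata_without_items": sorted(records - files),
        "matched": sorted(files & records),
    }


# Hypothetical identifiers, for illustration only
report = reconcile(
    ["abc001", "abc002", "abc004"],
    ["abc001", "abc003", "abc004"],
)
for label, ids in report.items():
    print(f"{label}: {len(ids)} {ids}")
```

A report like this only says where the mismatches are. Working out why a record or a file is missing is still a job for a person.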
Despite these challenges, we eagerly press on. As each collection moves we get a little closer to having all of our digital collections under preservation control and providing access to all of them from a single platform. Onward!
The initial thought I had for this blog post was to describe a slice of my day that revolved around the work of William Gedney. I was going to spin a tale about being on the hunt for a light meter to take lux (illuminance) readings used to help calibrate the capture environment of one of our scanners. On my search for the light meter I bumped into the new exhibit of William Gedney’s handmade books displayed in the Chappell Family Gallery in the Perkins Library. I had digitized a number of these books a few months ago and enjoyed pretty much every image in the books. One of the books on display was opened to a particular photograph. To my surprise, I had just digitized a finished print of the same image that very morning while working on a larger project to digitize all of Gedney’s finished prints, proof prints, contact sheets and other material. Once the project is complete (a year or so from now) I will have personally seen, handled and digitized over 20,000 of Gedney’s photographs. Whoa! Would I be able to recognize Gedney images whenever one presented itself, just like the book in the gallery? Maybe.
Once the collection is digitized and published through Duke Digital Collections the whole world will be able to see this amazing body of work. Instead of boring you with the details of that story I thought I would just leave you with a few images from the collection. For me, many of Gedney’s photographs have a kinetic energy to them. It seems as if I can almost feel the air. My imagination may be working overtime to achieve this and the reality of what was happening when the photograph was taken may be wholly different but the fact is these photographs spin up my imagination and transport me to the moments he has captured. These photographs inspire me to dust off my enlarger and set up a darkroom.
It may take some time to complete this particular project but there are other William Gedney related projects, materials and events available at Duke.
Life in Duke University Libraries has been even more energetic than usual these past months. Our neighbors in Rubenstein just opened their newly renovated library and the semester is off with a bang. As you can read over on Devil’s Tale, a lot of effort went on behind the scenes to get that sparkly new building ready for the public. In following that theme, today I am sharing some thoughts on how producing digital collections both blesses and curses my perspective on our finished products.
When I write a Bitstreams post, I look for ideas in my calendar and to-do list to find news and projects to share. This week I considered writing about “Ben”, those prints/negs/spreadsheets, and some resurrected proposals I’ve been fostering (don’t worry, these labels shouldn’t make sense to you). I also turned to my list of favorite items in our digital collections; these are items I find particularly evocative and inspiring. While reviewing my favorites with my possible topics in mind (Ben, prints/negs/spreadsheets, etc), I was struck by how differently patrons and researchers must relate to Duke Digital Collections than I do. Where they see a polished finished product, I see the result of a series of complicated tasks I both adore and would sometimes prefer to disregard.
Let me back up and say that my first experience with Duke digital collections projects isn’t always about content or proper names. Someone comes to me with an idea and of course I want to know about the significance of the content, but from there I need to know: What format? How many items? Is the collection processed? What kind of descriptive data is available? Do you have a student to loan me? My mind starts spinning with logistics, logistics, logistics. These details take on a life of their own, separate from the significant content at hand. As a project takes off, I come to know a collection by its details, the web of relationships I build to complete the project, and the occasional nickname. Let’s look at a few examples.
Parts of this collection are published, but we are expanding and improving the online collection dramatically.
What the public sees: poignant and powerful images of everyday life in an array of settings (Brooklyn, India, San Francisco, Rural Kentucky, and others).
What I see: 50,000 items in lots of formats; this project could take over DPC photographic digitization resources, all publication resources, all my meetings, all my emails, and all my thoughts (I may be overdramatizing here just a smidge). When it all comes together, it will be amazing.
Benjamin Rush Papers
We have just begun working with this collection, but the Devil’s Tale blog recently shared a sneak preview.
What people will see: letters to and from fellow founding fathers including Thomas Jefferson (Benjamin Rush signed the Declaration of Independence), as well as important historical medical accounts of a Yellow Fever outbreak in 1793.
What I see: Ben or, when I’m really feeling it, Benny. We are going to test out an amazing new workflow between ArchivesSpace and DPC digitization guides with Ben.
This collection of photographs was published in 2008. Since then we have added more images to it, and enhanced portions of the collection’s metadata.
What others see: a striking portfolio of a Southern itinerant photographer’s portraits featuring a diverse range of people. Mangum also had a studio in Durham at the beginning of his career.
What I see: HMP. HMP is the identifier for the collection included in every URL, which I always have to remind myself when I’m checking stats or typing in the URL (at first I think it should be Mangum). HMP is sneaky, because every now and then the popularity of this collection spikes. I really want more people to get to know HMP.
The orphans are not literal children, but they come in all shapes and sizes, and span multiple collections.
What the public sees: the public doesn’t see these projects.
What I see: orphans – plain and simple. The orphans are projects that started, but then for whatever reason didn’t finish. They have complicated rights, metadata, formats, or other problems that prevent them from making it through our production pipeline. These issues tend to be well beyond my control, and yet I periodically pull out my list of orphans to see if their time has come. I feel an extra special thrill of victory when we are able to complete an orphan project; the Greek Manuscripts are a good example. I have my sights set on a few others currently, but do not want to divulge details here for fear of jinxing the situation.
I could go on and on about how the logistics of each project shape and reshape my perspective of it. My point is that it is easy to temporarily lose sight of the digital collections garden given how entrenched (and even lost at times) we are in the weeds. For my part, when I feel like the logistics of my projects are overwhelming, I go back to my favorites folder and remind myself of the beauty and impact of the digital artifacts we share with the world. I hope the public enjoys them as much as I do.
We experience a number of different cycles in the Digital Projects and Production Services Department (DPPS). There is of course the project lifecycle, that mysterious abstraction by which we try to find commonalities in work processes that can seem unique for every case. We follow the academic calendar, learn our fate through the annual budget cycle, and attend weekly, monthly, and quarterly meetings.
The annual reporting cycle at Duke University Libraries usually falls to departments in August, with those reports informing a master library report completed later. Because of the activities and commitments around the opening of the Rubenstein Library, the departments were let off the hook for their individual reports this year. Nevertheless, I thought I would use my turn in the Bitstreams rotation to review some highlights from our 2014-15 cycle.
In a recent feature on their blog, our colleagues at NCSU Libraries posted some photographs of dogs from their collections. Being a person generally interested in dogs and old photographs, I became curious where dogs show up in Duke’s Digital Collections. Using very unsophisticated methods, I searched digital collections for “dogs” and thought I’d share what I found.
Of the 60 or so collections in Digital Collections, 19 contain references to dogs. The table below lists the collections in which dogs or references to dogs appear most frequently.
Children are smoking in two of my favorite images from our digital collections.
One of them comes from the eleven days in 1964 that William Gedney spent with the Cornett family in Eastern Kentucky. A boy, crusted in dirt, clutching a bent-up Prince Albert can, draws on a cigarette. It’s a miniature of mawkish masculinity that echoes and lightly mocks the numerous shots Gedney took of the Cornett men, often shirtless and sitting on or standing around cars, smoking.
At some point in the now-distant past, while developing and testing our digital collections platform, I stumbled on “smoking dirt boy” as a phrase to use in testing for cases when a search returns only a single result. We kind of adopted him as an unofficial mascot of the digital collections program. He was a mini-meme, one we used within our team to draw chuckles, and added into conference presentations to get some laughs. Everyone loves smoking dirt boy.
It was probably 3-4 years ago that I stopped using the image to elicit guffaws, and started to interrogate my own attitude toward it. It’s not one of Gedney’s most powerful photographs, but it provokes a response, and I had become wary of that response. There’s a very complicated history of photography and American poverty that informs it.
While preparing this post, I did some research into the Cornett family, and came across an item from a discussion thread on a genealogy site, shown here in a screen cap. “My Mother would not let anyone photograph our family,” it reads. “We were all poor, most of us were clean, the Cornetts were another story.” It captures the attitudes that intertwine in that complicated history. The resentment toward the camera’s cold eye on Appalachia is apparent, as is the disdain for the family that implicitly wasn’t “clean,” and let the photographer shoot. These attitudes came to bear in an incident just this last spring, in which a group in West Virginia confronted traveling photographers who they claimed had photographed children without permission.
Gedney’s photographs have taken on a life as a digital collection since they were published on the Duke University Libraries’ web site in 1999. It has become a high-use collection for the Rubenstein Library; that use has driven a recent project we have undertaken in the library to re-process the collection and digitize the entire corpus of finished prints, proof prints, and contact sheets. We expect the work to take more than a year and produce more than 20,000 images (compared to the roughly 5,000 available now), but when it’s complete, it should add whole new dimensions to the understanding of Gedney’s work.
Another collection given life by its digitization is the Sidney Gamble Photographs. The nitrate negatives are so flammable that the library must store them off site, making access impossible without some form of reproduction. Digitization has made it possible for anyone in the world to experience Gamble’s remarkable documentation of China in the early 20th Century. Since its digitization, this collection has been the subject of a traveling exhibit, and will be featured in the Photography Gallery of the Rubenstein Library’s new space when it opens in August.
The photograph of the two boys in the congee distribution line is another favorite of mine. Again, a child is seen smoking in a context that speaks of poverty. There’s plenty to read in the picture, including the expressions on the faces of the different boys, and the way they press their bowls to their chests. But there are two details that make this image rich with implicit narrative – the cigarette in the taller boy’s mouth, and the protective way he drapes his arm over the shorter one. They have similar, close-cropped haircuts, which are also different from the other boys, suggesting they came from the same place. It’s an immediate assumption that the boys are brothers, and the older one has taken on the care and protection of the younger.
Still, I don’t know the full story, and exploring my assumptions about the congee line boys might lead me to ask probing questions about my own attitudes and “visual definition” of the world. This process is one of the aspects of working with images that makes my work rewarding. Smoking dirt boy and the congee line boys are always there to teach me more.
Before you let your eyes glaze over at the thought of metadata, let me familiarize you with the term and its invaluable role in the creation of the library’s online Digital Collections. Yes, metadata is a rather jargony word librarians and archivists find themselves using frequently in the digital age, but it’s not as complex as you may think. In the simplest terms, the Society of American Archivists defines metadata as “data about data.” Okay, what does that mean? According to the good ol’ trusty Oxford English Dictionary, it is “data that describes and gives information about other data.” In other words, if you have a digitized photographic image (data), you will also have words to describe the image (metadata).
Better yet, think of it this way. If that image were of a large family gathering and grandma lovingly wrote the date and names of all the people on the backside, that is basic metadata. Without that information those people and the image would suddenly have less meaning, especially if you have no clue who those faces are in that family photo. It is the same with digital projects. Without descriptive metadata, the items we digitize would hold less meaning and prove less valuable for researchers, or at least be less searchable. The better and more thorough the metadata, the more it promotes discovery in search engines. (Check out the metadata from this Cornett family photo from the William Gedney collection.)
The term metadata was first used in the late 1960s in the context of computer programming languages. With the advent of computing technology and the overabundance of digital data, metadata became a key element to help describe and retrieve information in an automated way. The use of the word metadata in literature over the last 45 years shows a steeper increase from 1995 to 2005, which makes sense: the term was used more and more as technology grew more widespread. This is reflected in the graph below from Google’s Ngram Viewer, which scours over 5 million Google Books to track the usage of words and phrases over time.
Because of its link with computer technology, metadata is widely used in a variety of fields that range from computer science to the music industry. Even your music playlist is full of descriptive metadata that relates to each song, like the artist, album, song title, and length of audio recording. So, libraries and archives are not alone in their reliance on metadata. Generating metadata is an invaluable step in the process of preserving and documenting the library’s unique collections. It is especially important here at the Digital Production Center (DPC) where the digitization of these collections happens. To better understand exactly how important a role metadata plays in our job, let’s walk through the metadata life cycle of one of our digital projects, the Duke Chapel Recordings.
The Chapel Recordings project consists of digitizing over 1,000 cassette and VHS tapes of sermons and over 1,300 written sermons that were given at the Duke Chapel from the 1950s to 2000s. These recordings and sermons will be added to the existing Duke Chapel Recordings collection online. Funded by a grant from the Lilly Foundation, this digital collection will be a great asset to Duke’s Divinity School and those interested in hermeneutics worldwide.
Before the scanners and audio capture devices are even warmed up at the DPC, preliminary metadata is collected from the analog archival material. Depending on the project, this metadata is created either by an outside collaborator or in-house at the DPC. For example, the Duke Chronicle metadata is created in-house by pulling data from each issue, like the date, volume, and issue number. I am currently working on compiling the pre-digitization metadata for the 1950s Chronicle, and the spreadsheet looks like this:
As for the Chapel Recordings project, the DPC received an inventory from the University Archives in the form of an Excel spreadsheet. This inventory contained the preliminary metadata already generated for the collection, which is also used in Rubenstein Library’s online collection guide.
The University Archives also supplied the DPC with an inventory of the sermon transcripts containing basic metadata compiled by a student.
Here at the DPC, we convert this preliminary metadata into a digitization guide, which is a fancy term for yet another Excel spreadsheet. Each digital project receives its own digitization guide (we like to call them digguides) which keeps all the valuable information for each item in one place. It acts as a central location for data entry, but also as a reference guide for the digitization process. Depending on the format of the material being digitized (image, audio, video, etc.), the digitization guide will need different categories. We then add these new categories as columns in the original inventory spreadsheet and it becomes a working document where we plug in our own metadata generated in the digitization process. For the Chapel Recordings audio and video, the metadata created looks like this:
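The idea of growing an inventory into a digitization guide can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the column names, the sample rows, and the function names are hypothetical, and the real guides are Excel spreadsheets rather than CSV text.

```python
import csv
import io

# Hypothetical inventory as it might arrive from an archive (column names are assumptions)
inventory_csv = """item_id,title,date
tape001,Sunday Service,1985-03-10
tape002,Baccalaureate Sermon,1991-05-12
"""

# Hypothetical columns filled in during digitization; they differ by format
EXTRA_COLUMNS = {
    "audio": ["file_name", "duration", "digitized_by", "capture_date"],
    "image": ["file_name", "pixel_dimensions", "digitized_by", "capture_date"],
}


def build_digitization_guide(inventory_text, media_format):
    """Return (columns, rows) with empty format-specific columns appended."""
    reader = csv.DictReader(io.StringIO(inventory_text))
    extra = EXTRA_COLUMNS[media_format]
    columns = list(reader.fieldnames) + extra
    rows = []
    for row in reader:
        for column in extra:
            row[column] = ""  # left blank until the item is digitized
        rows.append(row)
    return columns, rows


def guide_to_csv(columns, rows):
    """Serialize the guide back to CSV text so it can be opened as a spreadsheet."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


columns, rows = build_digitization_guide(inventory_csv, "audio")
print(guide_to_csv(columns, rows))
```

Keeping the original inventory columns untouched and only appending new ones mirrors the working-document approach described above: the archive's description and the digitization data live side by side in one sheet.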
Once we have digitized the items, we then run the recordings through several rounds of quality control. This generates even more metadata which is, again, added to the digitization guide. As the Chapel Recordings have not gone through quality control yet, here is a look at the quality control data for the 1980s Duke Chronicle:
Once the digitization and quality control are completed, the DPC sends the digitization guide filled with metadata to the metadata archivist, Noah Huffman. Noah then makes further additions, edits, and deletions to match the spreadsheet metadata fields to fields accepted by the management software, CONTENTdm. During the process of ingesting all the content into the software, CONTENTdm links the digitized items to their corresponding metadata from the Excel spreadsheet. This is in preparation for placing the material online. For even more metadata adventures, see Noah’s most recent Bitstreams post.
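Matching spreadsheet columns to the fields a target system accepts is often called a crosswalk. Here is a minimal sketch of the idea, with loud caveats: the column names, target field names, and sample row are all hypothetical, and this is not CONTENTdm's actual schema or import process.

```python
# Hypothetical mapping from local spreadsheet columns to target field names.
# These names are assumptions for illustration, not a real system's schema.
CROSSWALK = {
    "item_id": "identifier",
    "title": "title",
    "date": "date",
    "speaker": "creator",
}


def crosswalk(row, mapping=CROSSWALK):
    """Rename mapped columns, drop empty values, and report unmapped columns."""
    mapped = {}
    unmapped = []
    for column, value in row.items():
        target = mapping.get(column)
        if target is None:
            unmapped.append(column)  # needs a human decision: map it or drop it
        elif value.strip():
            mapped[target] = value.strip()
    return mapped, unmapped


# Hypothetical row from a digitization guide
record, leftovers = crosswalk({
    "item_id": "tape001",
    "title": " Sunday Service ",
    "date": "1985-03-10",
    "speaker": "",
    "qc_notes": "ok",
})
print(record)
print(leftovers)
```

Reporting the unmapped columns instead of silently discarding them leaves room for the judgment calls (additions, edits, and deletions) that a person still has to make.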
In the final stage of the process, the compiled metadata and digitized items are published online at our Digital Collections website. You, the researcher, history fanatic, or Sunday browser, see the results of all this work on the page of each digital item online. This metadata is what makes your search results productive, and if we’ve done our job right, the digitized items will be easily discovered. The Chapel Recordings metadata looks like this once published online:
Further down the road, the Duke Divinity School wishes to enhance the current metadata to provide keyword searches within the Chapel Recordings audio and video. This will allow researchers to jump to specific sections of the recordings and find the exact content they are looking for. The additional metadata will greatly improve the user experience by making it easier to search within the content of the recordings, and will add value to the digital collection.
On this journey through the metadata life cycle, I hope you have been convinced that metadata is a key element in the digitization process. From preliminary inventories, to digitization and quality control, to uploading the material online, metadata has a big job to do. At each step, it forms the link between a digitized item and how we know what that item is. The life cycle of metadata in our digital projects at the DPC is sometimes long and tiring, but each stage of the process creates and uses the metadata in varied and important ways. Ultimately, all this arduous work pays off when a researcher in our digital collections strikes gold.
We’re continually walking through doorways or passing them by, but how often do we linger to witness the life that unfolds nearby? Let the photographs below be your doorway, connecting you with lives lived in other places and times.
We’ve written many posts on this blog that describe (in detail) how we build our digital collections at Duke, how we describe them, and how we make them accessible to researchers.
At a Rubenstein Library staff meeting this morning, one of my colleagues, Sarah Carrier, gave an interesting report on how some of our researchers are actually using our digital collections. Sarah’s report focused specifically on permission-to-publish requests, that is, cases where researchers requested permission from the library to publish reproductions of materials in our collection in scholarly monographs, journal articles, exhibits, websites, documentaries, and any number of other creative works. To be clear, Sarah examined all of these requests, not just those involving digital collections. Below is a chart showing the distribution of the types of publication uses.
What I found especially interesting about Sarah’s report, though, is that nearly 76% of permission-to-publish requests did involve materials from the Rubenstein that have been digitized and are available in Duke Digital Collections. The chart below shows the Rubenstein collections that generate the highest percentage of requests. Notice that three of these in Duke Digital Collections were responsible for 40% of all permission-to-publish requests:
So, even though we’ve only digitized a small fraction of the Rubenstein’s holdings (probably less than 1%), it is this 1% that generates the overwhelming majority of permission-to-publish requests.
I find this stat both encouraging and discouraging at the same time. On one hand, it’s great to see that folks are finding our digital collections and using them in their publications or other creative output. On the other hand, it’s frightening to think that the remainder of our amazing but yet-to-be digitized collections are rarely if ever used in publications, exhibits, and websites.
I’m not suggesting that researchers aren’t using un-digitized materials. They certainly are, in record numbers. More patrons are visiting our reading room than ever before. So how do we explain these numbers? Perhaps research and publication are really two separate processes. Imagine you’ve just written a 400-page monograph on the evolution of popular song in America. You probably just want to sit down at your computer, fire up your web browser, and do a Google Image Search for “historic sheet music” to find some cool images to illustrate your book. Maybe I’m wrong, but if I’m not, we’ve got you covered. After it’s published, send us a hard copy. We’ll add it to the collection and maybe we’ll even digitize it someday.
[Data analysis and charts provided by Sarah Carrier – thanks Sarah!]
We try to keep our posts pretty focused on the important work at hand here at Bitstreams central, but sometimes even we get distracted (speaking of which, did you know that you can listen to the Go-Go’s for hours and hours on Spotify?). With most of our colleagues in the library leaving for or returning from vacation, it can be difficult to think about anything but exotic locations and what to do with all the time we are not spending in meetings. So this week, dear reader, we give you a few snapshots of vacation adventures told through Duke Digital Collections.
Many of Duke’s librarians (myself included) head directly east for a few days of R&R at one of the many beautiful North Carolina beaches. Who can blame them? It seems like everyone loves the beach, including William Gedney, Deena Stryker, Paul Kwilecki and even Sidney Gamble. Luckily for North Carolinians, the beach is only a short trip away, but of course there are essentials that you must not forget even on such a short journey.
Of course, many colleagues have ventured even farther afield to West Virginia, Minnesota, Oregon, Maine and even Africa!! Wherever our colleagues are, we hope they are enjoying some well-deserved time off. For those of us who have already had our time away or are looking forward to next time, we will just have to live vicariously through our colleagues’ and our collections’ adventures.
Notes from the Duke University Libraries Digital Projects Team