In what now seems like the way distant past, just before the library building closed, Ryan Baumann and I virtually presented on DUL’s work with multispectral imaging at the More than Meets the Eye conference, hosted by the University of Iowa. The three-day conference was a wonderful opportunity to hear examples from around the world on the application of enhanced digital imaging technologies in research on cultural heritage.
This month the library kicked off a weekly “Lunch & Learn” series, in which library staff give short (~20 min) presentations about their recent work or research interests. It provides an additional opportunity to connect with our colleagues and learn something new each week. Since I already had my slides from the Iowa conference, I volunteered to present in the first session, and today I would like to share those slides with Bitstreams readers:
I’ve included my script in the notes for each slide, so you can get the full context. There are also links to other talks the MSI team has done over the years and other work in enhanced imaging from my colleagues.
Just before Duke stopped travel for all faculty and staff last week, I was able to attend what will probably turn out to have been one of the last conferences of the spring: the Research Data Access and Preservation Association’s (RDAP) annual summit in Santa Fe, New Mexico. RDAP is a community of “data managers and curators, librarians, archivists, researchers, educators, students, technologists, and data scientists from academic institutions, data centers, funding agencies, and industry who represent a wide range of STEM disciplines, social sciences, and humanities,” and who are committed to creating, maintaining, and teaching best practices for the access and preservation of research data. While there were many interesting presentations and posters about the work being done in this area at various institutions around the country, the conference and RDAP’s work more broadly resonated with me in a very general and timely way, which did not necessarily stem from anything I heard during the week.
In a situation like the global pandemic we are now facing, open and unfettered access to research data is vital for treating patients, attempting to stem the course of the disease, and potentially developing life-saving vaccines.
A recent editorial in Science Translational Medicine argues that data-driven models and centralized data sharing are the best way to approach this kind of outbreak, stating “[w]e believe that scientific efforts need to include determining the values (and ranges) of the above key variables and identifying any other important ones. In addition, information on these variables should be shared freely among the scientific and the response and resilience communities, such as the Red Cross, other nongovernmental organizations, and emergency responders” (Layne et al., 2020). As another article points out, sharing viral samples from around the world has allowed scientists to get a better picture of the disease’s genetic makeup: “[c]omparing those genomes allowed Bedford and colleagues to piece together a viral family tree. ‘We can chart this out on the map, then, because we know that this genome is connected to this genome by these mutations,’ he said. ‘And we can learn about these transmission links’” (Sanders, 2020).
Scientists are also accelerating the research lifecycle by using preprint servers like arXiv, bioRxiv, and medRxiv to share their preliminary conclusions without waiting on the often glacial process of peer review. This isn’t a wholly unalloyed positive, and many preprints warrant the increased scrutiny that peer review represents. Moreover, scientific research often benefits from the kind of contextualization and unpacking that peer review and science journalism can occasionally provide. But in the acute crisis that the current outbreak presents, the rapid spread of information among scientific peer networks can undoubtedly save lives.
Continuing to develop and build the infrastructure—in terms of both technology and policy frameworks—needed to conduct the kind of data sharing we are seeing now remains a goal for the scientific community moving forward.
The Libraries, along with communities like RDAP, the Research Data Alliance, and the Data Curation Network, endorse and support this mission, and we will continue to play our role in preserving and providing persistent access to research data as best we can as we all move forward through this together. In the meantime, we hope everyone in the Duke community stays safe and healthy!
 Layne, S. P., Hyman, J. M., Morens, D. M., & Taubenberger, J. K. (2020, March 11). New coronavirus outbreak: Framing questions for pandemic prevention. Science Translational Medicine 12(534). https://doi.org/10.1126/scitranslmed.abb1469
 Sanders, L. (2020, February 13). Coronavirus’s genetic fingerprints are used to rapidly map its spread. Science News. https://www.sciencenews.org/article/coronavirus-genetic-fingerprints-are-used-to-rapidly-map-spread
Back in August I wrote a Bitstreams post about the various ways by which those of us who work with library metadata could attempt to tackle the issue of problematic descriptions and descriptive standards. One of the methods I mentioned was activism, and I highlighted the documentary ‘Change the Subject!’, which follows the story of students and librarians at Dartmouth College as they worked together to lobby the Library of Congress to stop using the term ‘illegal aliens’ to describe undocumented immigrants.
Recently, the Triangle Research Libraries Network offered a screening of this documentary to its constituent libraries, who were treated to a special viewing (and free popcorn!) at Durham’s iconic Carolina Theatre. I attended this screening and participated in a panel discussion following the film.
I found the documentary to be both encouraging and disheartening: encouraging, as the student activists’ vision, fortitude, and perseverance are inspiring, but disheartening as ultimately, their campaign to have the term ‘illegal aliens’ removed from the Library of Congress Subject Headings failed, due to intervention from Congress.
However, the panel discussion following the screening restored some of my faith that we could still manage problematic metadata with the tools at our disposal. Some of the ideas that were mentioned included:
Identifying alternative thesauri and vocabularies that better represent diversity, equity, and inclusion, and being proactive in mapping problematic metadata to preferred terms.
Working with library vendors to communicate that this is an issue we care about, and perhaps suggesting the use of more inclusive language in their products.
Working with students and student activist groups to collaborate on identifying and remediating areas for improvement in our descriptive practices (as well as library work and spaces in general).
Continuing to use SACO funnels – formal channels for submitting subject authority records to the Library of Congress – while recognizing that this is time consuming yet important work.
And, of course, we can use the technological solution we have already developed for suppressing problematic subject headings from the shared TRLN discovery layer (i.e., the catalogs of Duke, UNC, and NCSU). Work has progressed on developing policies and governance to support workflows for implementing this solution, including the formation of a TRLN Discovery Metadata Team, which will focus on the shared discovery layer, and a more broadly focused TRLN Metadata Interest Group. Stay tuned!
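As a rough illustration of the mapping and suppression ideas discussed by the panel, the sketch below swaps a flagged subject heading for a preferred term at display time. The mapping table, the function name, and the handling of subdivisions are all hypothetical; this is not the TRLN implementation, only one way such a remapping could look.

```python
# Hypothetical local mapping from flagged headings to preferred display terms.
PREFERRED_TERMS = {
    "Illegal aliens": "Undocumented immigrants",
}


def remap_subjects(subjects, mapping=PREFERRED_TERMS):
    """Return display subjects with flagged headings replaced, without duplicates."""
    seen = set()
    display = []
    for heading in subjects:
        # Only the leading element of a subdivided heading is compared here.
        parts = heading.split("--")
        first = parts[0].strip()
        parts[0] = mapping.get(first, first)
        label = " -- ".join(p.strip() for p in parts)
        if label not in seen:
            seen.add(label)
            display.append(label)
    return display
```

Because the replacement happens only in the displayed label, the underlying catalog record stays untouched, which is the general appeal of handling this in the discovery layer.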
One of the highlights of the Association of Moving Image Archivists’ annual conference is “Archival Screening Night,” where members of the AMIA community showcase recently-discovered and newly-restored film and video footage. The event usually takes place in a historic movie theater, with skilled projectionists who are able to present the film materials in their original format, on the big screen. At the most recent AMIA conference, in Portland, Oregon, there was a wide array of impressive material presented, but one film in particular left the audience speechless, and is a wonderful example of how archivists can unearth treasures that can alter our perspective on human history, and humanity itself.
The film, “Something Good – Negro Kiss” was made in 1898. It’s silent, black & white, and is less than a minute long. But it’s groundbreaking, in that it shows the earliest known depiction of an African-American couple kissing, and stands in opposition to the racist, minstrel-show portrayals of black people so common in the early days of American filmmaking. The couple embrace, kiss, and sway back and forth in a playful, spontaneous dance that comes across as genuine and heartwarming. Although it may not have been intentional, the short film seems to be free of negative racial stereotypes. You can watch it here:
The film is likely an homage to “The Kiss” (also known as the May Irwin Kiss), a film made in 1896, with a white couple kissing. It was one of the first films ever shown commercially, and is the very first kiss on film. Even though the couple was white, and the kissing is remarkably tame by today’s standards, it created a lot of controversy at the time, because kissing in public was prohibited by law. The Catholic church and newspaper editorials denounced “The Kiss” and called for censorship and prosecution. Although there is no documented history yet about the public reaction to “Something Good – Negro Kiss,” one can only imagine the shock and scandal it must have caused, showing an African-American couple kissing each other, only two years later.
On learning that this year’s conference on Open Repositories would be held in Bozeman, Montana, I was initially perplexed. What an odd, out-of-the-way corner of the world in which to hold an international conference on the work of institutional digital repositories. After touching down in Montana, however, it quickly became apparent how appropriate the setting would be to this year’s conference—a geographic metaphor for the conference theme of openness and sustainability. I grew up out west, but coastal California has nothing on the incomprehensibly vast and panoramic expanse of western Montana. I was fortunate enough to pass a few days driving around the state before the conference began, culminating in a long afternoon spent at Yellowstone National Park. As we wrapped up our hike that afternoon by navigating the crowds and the boardwalks hovering over the terraces of the Mammoth Hot Springs, I wondered about the toll our presence took on the park, what responsible consumption of the landscape looks like, and how we might best preserve the park’s beauty for the future.
Tuesday’s opening remarks from Kenning Arlitsch, conference host Montana State University’s Dean of Libraries, reflected these concerns, pivoting from a few words on what “open” means for library and information professionals to a lengthier consideration of the impact of “openness” on the uniqueness and precarity of the greater Yellowstone ecosystem. Dr. Arlitsch noted that “[w]e can always create more digital space, but we cannot create more of these wild spaces.” While I agree unreservedly with the latter part of his statement, as the conference progressed, I found myself re-evaluating the whole of that assertion. Although it’s true that we may be able to create more digital space with some ease (particularly as the strict monetary cost of digital storage becomes more manageable), it’s what we do with this space that is meaningful for the future. One of my chief takeaways from my time in Montana was that responsibly stewarding our digital commons and sustaining open knowledge for the long term is hard, complicated work. As the volume of ever more complex digital assets grows, finding ways to responsibly ensure access now and for the future is increasingly difficult.
“Research and Cultural Heritage communities have embraced the idea of Open; open communities, open source software, open data, scholarly communications, and open access publications and collections. These projects and communities require different modes of thinking and resourcing than purchasing vended products. While open may be the way forward, mitigating fatigue, finding sustainable funding, and building flexible digital repository platforms is something most of us are striving for.”
Many of the sessions I attended took the curation of research data in institutional repositories as their focus; in particular, a Monday workshop on “Engaging Liaison Librarians in the Data Deposit Workflow: Starting the Conversation” highlighted that research data curation is taking place through a wide array of variously resourced and staffed workflows across institutions. A good number of institutions do not have their own local repository for data, and even those larger organizations with broad data curation expertise and robust curatorial workflows (like Carnegie Mellon University, representatives from which led the workshop) may outsource their data publishing infrastructure to applications like Figshare, rather than build a local solution. Curatorial tasks tended to mean different things in different organizational contexts, and workflows varied according to staffing capacity. Our workshop breakout group spent some time debating the question of whether institutional repositories should even be in the business of research data curation, given the demanding nature of the work and the disparity in available resources among research organizations. It’s a tough question without any easy answers; while there are some good reasons for institutions to engage in this kind of work where they are able (maintaining local ownership of open data, institutional branding for researchers), it’s hard to escape the conclusion that many IRs are under-equipped from the standpoint of staff or infrastructure to sustainably process the on-coming wave of large-scale research data.
Elsewhere, from a technical perspective, presentations chiefly seemed to emphasize modularity, microservices, and avoiding reinventing the wheel. Going forward, it seems as though community development and shared solutions to problems held in common will be integral strategies to sustainably preserving our institutional research output and digital cultural heritage. The challenge resides in equitably distributing this work and in providing appropriate infrastructure to support maintenance and governance of the systems preserving and providing access to our data.
Last week I had the opportunity to attend the 52nd Association for Recorded Sound Collections Annual Conference in Baltimore, MD. From the ARSC website:
Founded in 1966, the Association for Recorded Sound Collections, Inc. is a nonprofit organization dedicated to the preservation and study of sound recordings—in all genres of music and speech, in all formats, and from all periods.
ARSC is unique in bringing together private individuals and institutional professionals. Archivists, librarians, and curators representing many of the world’s leading audiovisual repositories participate in ARSC alongside record collectors, record dealers, researchers, historians, discographers, musicians, engineers, producers, reviewers, and broadcasters.
ARSC’s vitality springs from more than 1000 knowledgeable, passionate, helpful members who really care about sound recordings.
ARSC Annual Conferences encourage open sharing of knowledge through informative presentations, workshops, and panel discussions. Tours, receptions, and special local events heighten the camaraderie that makes ARSC conferences lively and enjoyable.
This quote highlights several of the things that have made ARSC resources valuable and educational to me as the Audio Production Specialist at Duke Libraries:
The group’s membership includes both professionals and enthusiasts from a variety of backgrounds and types of institutions.
Members’ interests and specialties span a broad array of musical genres, media types, and time periods.
The organization serves as a repository of knowledge on obscure and obsolete sound recording media and technology.
This year’s conference offered a number of presentations that were directly relevant to our work here in Digital Collections and Curation Services, highlighting audio collections that have been digitized and the challenges encountered along the way. Here’s a quick recap of some that stood out to me:
“Uncovering the Indian Neck Folk Festival Collection” by Maya Lerman (Folklife Center, Library of Congress). This presentation showcased a collection of recordings and related documentation from a small invitation-only folk festival that ran from 1961-2014 and included early performances from Reverend Gary Davis, Dave Van Ronk, and Bob Dylan. It touched on some of the difficulties in archiving optical and born-digital media (lack of metadata, deterioration of CD-Rs) as well as the benefits of educating prospective donors on best practices for media and documentation.
“A Garage in South Philly: The Vernacular Music Research Archive of Thornton Hagert” by David Sager and Anne Stanfield-Hagert. This presentation paid tribute to the massive jazz archive of the late Mr. Hagert, comprising over 125,000 items of printed music, 75,000 recordings, 5,500 books, and 2,000 periodicals. It spoke to the difficulties of selling or donating a private collection of this magnitude without splitting it up and undoing the careful, but idiosyncratic organizational structure as envisioned by the collector.
“Freedom is a Constant Struggle: The Golden State Mutual Sound Recordings” by Kelly Besser, Yasmin Dessem and Shanni Miller (UCLA Library). This presentation covered the audio material from the archive of an African American-owned insurance company founded in 1925 in Los Angeles. While audio was only a small part of this larger collection, the speakers demonstrated how it added context and depth to photographs, video, and written documents. They also showed how this kind of archival audio can be an important tool in telling the stories of previously suppressed or unheard voices.
“Sounds, Sights and Sites of Activism in ’68” by Guha Shankar (Library of Congress). This presentation examined a collection of recordings from “Resurrection City” in Washington, DC. This was an encampment that was part of the Poor People’s Campaign, a demonstration for human rights organized by Martin Luther King, Jr. prior to his assassination in 1968. The talk showed how these archival documents are being accessed and used to inform new forms of social and political activism and to reach wider audiences via podcasts, websites, public lectures, and exhibitions.
The ARSC Conference also touched on my personal interests in American traditional and vernacular music, especially folk and blues from the early 20th Century. Presentations on the bluegrass scene in Baltimore, blues guitarist Johnny Shines, education outreach by the creators of PBS’s “American Epic” documentaries, and Hickory, NC’s own Blue Sky Boys provided a welcome break from favorite archivist topics such as metadata, workflows, and quality control. Other fun parts of the conference included an impromptu jam session, a silent auction of books & records, and posters documenting the musical history of Baltimore. True to the city’s nickname, I was charmed by my time in Baltimore and inspired by the amazingly diverse and dedicated work towards collecting and preserving our audio heritage by the ARSC community.
This past year brought renewed focus on AV development, as we worked to bring the NEH grant-funded Radio Haiti Archive online (launched in June). At the same time, our digital collections legacy platform migration efforts shifted toward moving our existing high-profile digital AV material into the repository.
At Duke University Libraries, we take accessibility seriously. We aim to include captions or transcripts for the audiovisual objects made available via the Duke Digital Repository, especially to ensure that the materials can be perceived and navigated by people with disabilities. For instance, work is well underway to create closed captions for all 1,400 items in the Duke Chapel Recordings project.
The DDR now accommodates modeling and ingest for caption files, and our AV player interface (powered by JW Player) presents a CC button whenever a caption file is available. Caption files are encoded using WebVTT, the modern W3C standard for associating timed text with HTML audio and video. WebVTT is structured so as to be machine-processable, while remaining lightweight enough to be reasonably read, created, or edited by a person. It’s a format that transcription vendors can provide. And given its endorsement by W3C, it should be a viable captioning format for a wide range of applications and devices for the foreseeable future.
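To give a sense of how lightweight the format is, here is a minimal sketch in Python that reads cues out of a small WebVTT sample. The sample text and the `parse_cues` function are invented for illustration and are not part of the DDR codebase; the parser handles only simple cues and ignores much of the full specification (notes, styles, regions).

```python
import re

# Invented sample: a header line, then cues separated by blank lines.
SAMPLE_VTT = """WEBVTT

00:00:01.000 --> 00:00:04.000
Good morning, and welcome.

00:00:04.500 --> 00:00:08.250
Please be seated.
"""

# A timing line looks like "00:00:01.000 --> 00:00:04.000" (hours optional).
TIMING = re.compile(r"(\d{2}:)?\d{2}:\d{2}\.\d{3} --> (\d{2}:)?\d{2}:\d{2}\.\d{3}")


def parse_cues(vtt_text):
    """Return a list of (start, end, text) tuples from simple WebVTT text."""
    cues = []
    # Blocks are separated by blank lines; the first block is the WEBVTT header.
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if TIMING.match(line):
                start, _, rest = line.partition(" --> ")
                end = rest.split()[0]  # drop any cue settings after the end time
                cues.append((start.strip(), end, " ".join(lines[i + 1:])))
                break
    return cues
```

Calling `parse_cues(SAMPLE_VTT)` yields one tuple per cue, which is enough structure to build a readable transcript or a search index from a caption file.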
Displaying captions within the player UI is helpful, but it only gets us so far. For one, that doesn’t give a user a way to just read the caption text without requiring them to play the media. We also need to support captions for audio files, but unlike with video, the audio player doesn’t include enough real estate within itself to render the captions. There’s no room for them to appear.
We also do some extra formatting when the WebVTT cues include voice tags (<v> tags), which can optionally indicate the name of the speaker (e.g., <v Jane Smith>). The in-page transcript is indexed by Google for search retrieval.
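The voice-tag handling can be sketched roughly as follows. This is an assumption about the general approach, not our production code: the function name is hypothetical, and it labels a cue with only the first speaker it finds.

```python
import re

# Matches an opening voice tag such as <v Jane Smith> or <v.loud Jane Smith>.
VOICE = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")
# Matches any opening or closing tag so it can be stripped from the text.
TAG = re.compile(r"</?[^>]+>")


def format_cue_text(cue_text):
    """Turn '<v Jane Smith>Hello' into 'Jane Smith: Hello' and drop other tags."""
    match = VOICE.search(cue_text)
    speaker = match.group(1).strip() if match else None
    plain = TAG.sub("", cue_text).strip()
    return f"{speaker}: {plain}" if speaker else plain
```

Cues without a voice tag pass through as plain text, so the same function can be applied to every cue in a file.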
In many cases, especially for audio items, we may have only a PDF or other type of document with a transcript of a recording that isn’t structured or time-coded. Like captions, these documents are important for accessibility. We have developed support for displaying links to these documents near the media player. Look for some new collections using this feature to become available in early 2018.
The DDR web interface provides an optimal viewing or listening experience for AV, but we also want to make it easy to present objects from the DDR on other websites. When used on other sites, we’d like the objects to include some metadata, a link to the DDR page, and proper attribution. To that end, we now have copyable <iframe> embed code available from the Share menu for AV items.
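For readers curious what generating such a snippet involves, here is a small sketch. The URL pattern, default dimensions, and function name are placeholders made up for this example; the actual embed code comes from the Share menu and may differ.

```python
from html import escape


def build_embed_code(embed_url, title, width=640, height=360):
    """Return an <iframe> snippet for a (hypothetical) embeddable item URL."""
    return (
        f'<iframe src="{escape(embed_url, quote=True)}" '
        f'title="{escape(title, quote=True)}" '
        f'width="{width}" height="{height}" '
        'frameborder="0" allowfullscreen></iframe>'
    )
```

Escaping the URL and title keeps quotation marks in an item's title from breaking the generated markup, and the `title` attribute gives screen reader users a label for the frame.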
This embed code is also what we now use within the Rubenstein Library collection guides (finding aids) interface: it lets us present digital objects from the DDR directly from within a corresponding collection guide. So as a researcher browses the inventory of a physical archival collection, they can play the media inline without having to leave.
If your website or blog is one of the thousands of WordPress sites hosted and supported by Sites@Duke — a service of Duke’s Office of Information Technology (OIT) — we have good news for you. You can now embed objects from the DDR using WordPress shortcode. Sites@Duke, like many content management systems, doesn’t allow authors to enter <iframe> tags, so shortcode is the only way to get embeddable media to render.
Here are the other AV-related features we have been able to develop in 2017:
Access control: master files & derivatives alike can be protected so access is limited to only authorized users/groups
Video thumbnail images: model, manage, and display
Video poster frames: model, manage, and display
Intermediate/mezzanine files: model and manage
Rights display: display icons and info from RightsStatements.org and Creative Commons, so it’s clear what users are permitted to do with media.
We look forward to sharing our recent AV development with our peers at the upcoming Samvera Connect conference (Nov 6-9, 2017 in Evanston, IL). Here’s our poster summarizing the work to date:
Looking ahead to the next couple months, we aim to round out the year by completing a few more AV-related features, most notably:
Export WebVTT captions as PDF or .txt
Advance the player via linked timecodes in the description field in an item’s metadata
Improve workflows for uploading caption files and transcript documents
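The linked-timecode idea could work along these lines: find timecodes in a description and convert each to an offset in seconds that a player can seek to. This is only a sketch of the planned feature under our own assumptions; the function name and the accepted formats (M:SS and H:MM:SS) are illustrative.

```python
import re

# Matches M:SS, MM:SS, or H:MM:SS style timecodes inside free text.
TIMECODE = re.compile(r"\b(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\b")


def find_timecodes(description):
    """Return (matched_text, offset_in_seconds) for each timecode found."""
    results = []
    for m in TIMECODE.finditer(description):
        hours = int(m.group(1) or 0)
        minutes = int(m.group(2))
        seconds = int(m.group(3))
        results.append((m.group(0), hours * 3600 + minutes * 60 + seconds))
    return results
```

Each matched string could then be wrapped in a link whose click handler seeks the player to the computed offset.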
Now that these features are in place, we’ll be sharing a bunch of great new AV collections soon!
In June of this year I was fortunate to have participated in the inaugural TRLN Institute. Modeled as a sort of Scholarly Communication Institute for TRLN (Triangle Research Libraries Network, a consortium located in the Triangle region of North Carolina), the Institute provided space (the magnificent Hunt Library on North Carolina State University’s campus), time (three full days), and food (Breakfast! Lunch! Coffee!) for groups of 4-6 people from member libraries to get together to exclusively focus on developing innovative solutions to shared problems. Not only was it productive, it was truly delightful to spend time with colleagues from member institutions who, although we are geographically close, don’t get together often enough.
Six projects were chosen from a pool of applicants who proposed topics around this year’s theme of Scholarly Communication:
Supporting Scholarly Communications in Libraries through Project Management Best Practices
Locating Research Data in an Age of Open Access
Clarifying Rights and Maximizing Reuse with RightsStatements.org
Building a Research Data Community of Practice in NC
Building the 21st Century Researcher Brand
Scholarship in the Sandbox: Showcasing Student Works
You can read descriptions of the projects as well as group membership here.
Having this much dedicated and unencumbered time to thoughtfully and intentionally address a problem area with colleagues was invaluable. And the open schedule allowed groups to be flexible as their ideas and expectations changed throughout the course of the three-day program. My own group – Clarifying Rights and Maximizing Reuse with RightsStatements.org – was originally focused on developing practices for the application and representation of RightsStatements.org statements for TRLN libraries’ online digitized collections. Through talking as a group, however, we realized early on that some of the stickiest issues regarding the implementation of a new rights management strategy involve the work an institution has to do to identify appropriate staff, allocate resources, plan, and document the process.
So, we pivoted! Instead of developing a decision matrix for applying the RS.org statements in digital collections (which is what we originally thought our output would be), we instead spent our time drafting a report – a roadmap of sorts – that describes the following important components when implementing RightsStatements.org:
roles and responsibilities (including questions that a person in a role would need to ask)
necessary planning and documentation
example implementations (including steps taken and staff involved – perhaps the most useful section of the report)
I’d say that the first TRLN Institute was a success. I can’t imagine my group having self-organized and produced a document in just over a month without having first had three days to work together in the same space and unencumbered by other responsibilities. I think other groups have found valuable traction via the Institute as well, which will result in more collaborative efforts. I look forward to seeing what future TRLN Institutes produce – this is definitely a model to continue!
I always appreciate the bird’s-eye view of the work I do gained by attending national conferences, and often come away with novel ideas on how to solve old problems and colleagues to reach out to when I encounter new ones. So I was anticipating as much when I left home for the airport last week to attend the 2016 DLF Forum in Milwaukee, Wisconsin.
And that is indeed the experience I had, but in addition to gaining new ideas and new friends, the keynote for the conference challenged me to think deeply about the broader context in which we as librarians and information professionals do our work, who we do that work for, and whether or not we are living up to the values of inclusivity and accessibility that I hold dear.
The keynote speaker was Stacie Williams, a librarian/archivist who talked about the politics of labor in our communities, both specific to libraries and archives and beyond. She posited that all labor is local, and focused on the caregiving work that, so often performed by women and minorities in under- or unpaid positions, is the “beam underpinning this entire system of labor as we know it … and yet it remains the most invisible part of what makes our economy run”. In order to value ourselves and our work, we need to value all of the labor upon which our society is based. And she asked us to think of the work we do as librarians and archivists – the services we provide – as a form of caregiving to our own communities. She also posited that the information work in which we are engaged has followed the late capitalism trend of an anti-care ethos, and implored us to examine our own institutional practices, asking questions such as:
Do we engage in digitization projects where work was performed by at-will workers with no benefits or unpaid interns, or outsourced to prison workers?
Are we physically situated on university campuses that are inaccessible to our local community, either by way of location or prohibitive expense?
Have we undergone extreme cuts to our workforces, hindering our ability to provide services?
Do our hiring practices replicate systems that reward racial/gender class standards?
Do we build positions into grants that don’t pay living wages?
Williams asked us to interrogate the ways in which our labor practices are problematic and to center library work in the care ethics necessary “to reflect the standards of access and equality that we say we hold in this profession”. As a metadata specialist who spends a large chunk of her time working to create description for cultural heritage materials, this statement was especially resonant: “Few things are more liberatory than being able to tell your own story and history and have control and stewardship over your cultural narrative”. This is a tension I am especially aware of – describing resources for discovery and access in ways that honor and reflect the voices and self-identity of original creators or subjects.
The following days were of course filled with interesting and useful panels and presentations, working lunches, and meetups with new and old colleagues. But the keynote, along with the context of the national election, infused the rest of the conference with a spirit of thoughtfulness and openness toward engaging in a deep exploration of our labor practices and relationships to the communities of people we serve. It has given me a lot to think about, and I’m grateful to the DLF Forum planners for bringing this level of discourse to the annual conference.
Notes from the Duke University Libraries Digital Projects Team