Curating for a community: joining the DCN

At DUL, we talk quite a lot about the value of research data curation. The Libraries provide a curatorial review of all data packages submitted to the Research Data Repository for publication. This review can help to enhance a researcher’s dataset by enabling a second or third pair of eyes to look over the data and ensure that all documentation is as complete as possible and that the dataset as a whole has been optimized for long-term reuse. Although it’s not necessary to have expertise in the domain of the data under review, such expertise can give the curator a fuller picture of what is needed to help make those data FAIR. While data curators working in the Libraries possess a wealth of knowledge about general research data-related best practices, and are especially well-versed in the vagaries of social sciences data, they may not always have all the information they need to sufficiently assess the state of a dataset from a researcher.

As I discussed in a blog post back in 2019, for the last few years, Duke has been a part of a project designed to address gaps in domain proficiency that are a natural part of a curation program of our size. The Data Curation Network has functioned as a grant-supported consortium of data curation professionals located in research institutions who have pooled their knowledge to provide enhanced review for data that fall outside the expertise of local curators. Partner institutions can submit datasets to the Network and they will be matched with a DCN curator with the relevant domain experience. Beyond providing curation services, the DCN generates a variety of community resources pertaining to data curation, including a standardized set of curation steps and workflow, a list of essential data curation activities, and a growing roster of instructional primers to support the curation of various kinds of data.

The DCN has grown since my last post, and now includes curators from 11 institutions and the Dryad research data repository. DCN curators work with data from disciplines ranging from aerospace engineering to urban and regional planning and tackle data types from qualitative survey responses to machine learning model training datasets.

Updated for 2021!

Although two members have worked with the DCN for a few years, the rest of the DUL research data curation team is now getting in on the action. Last week, the two Repository Services Analysts embedded with the curation team began the process of onboarding to serve as DCN curators. While we have been able to contribute to local curation of datasets for the RDR, this new opportunity presents us with a chance to not only gain valuable experience working with some practiced curators, but also to contribute back to the community that has helped to support our work. We are very excited to expand and deepen our DCN participation!

Indexing variant names from the Library of Congress Name Authority File (LCNAF) in TRLN Discovery

You might or might not have noticed a TRLN Discovery feature announcement in the February TRLN News Roundup. It mentioned that we are now indexing variant names from the Library of Congress Name Authority File in TRLN Discovery. I thought in this post I would expand on what this change means for Duke’s Books & Media catalog, add some details about the technical implementation, and discuss some related features we might add in the future based on this work.

What is it?

First, the practical matter: what does this feature mean for people who search the catalog? Our catalog records contain authoritative forms of creator names. This is the specific form of the person’s name chosen as the authoritative form by the Library of Congress. For example, the authoritative form of Emily Dickinson’s name is “Dickinson, Emily, 1830-1886.” If you search the Books & Media catalog using this form of the poet’s name, you will find all records associated with her name (example search with the authoritative name). Previously, if you had searched the catalog and added the poet’s middle name, “Elizabeth,” it’s likely you would have missed many relevant results because “Elizabeth” is not included in the authoritative form of the name. It is, however, included in one of the variant names in the LC Name Authority File. The full list of variant names for Emily Dickinson is:

  • Dickinson, Emilia, 1830-1886
  • Dickinson, Emily Elizabeth, 1830-1886
  • Dickinson, Emily (Emily Elizabeth), 1830-1886
  • Dikinson, Ėmili, 1830-1886
  • D̲ikinson, Emily, 1830-1886
  • Ti-chin-sen, Ai-mi-li, 1830-1886
  • דיקינסון, אמילי, 1830־1886
  • דיקינסון, אמילי, 1886־1830
  • Dykinsan, Ėmili, 1830-1886

Emily Dickinson

Since we are now indexing these forms in TRLN Discovery, you get much better results if you happen to add Emily Dickinson’s middle name to your search (example search including a variant form of the name). Additionally, various romanizations and vernacular forms are indexed (example search for “דיקינסון, אמילי”).

If you clicked through to the example searches you may have noticed that the result counts and result order are slightly different when searching the authoritative form vs. the variant forms.

The variant forms are only indexed on records that include a URI that references the LC Name Authority File. If this URI reference is missing, the variant names are not indexed for that record. Additionally, some records may not have been updated since we implemented this feature. In time, all records that include URIs for names will have variant names indexed.

The difference in result order is due to how the variant names are indexed. For the authoritative form of the name we distinguish between creators, editors, contributors, etc. and give matches in these categories different boosts in the relevance ranking. At the moment, the variant names from the LCNAF file are indexed in a single field and so we lose the nuance needed for more granular relevance ranking. This is something that could be revised in the future if needed.

How does it work?

This feature relies on the fact that our MARC records include URI references to the LC Name Authority File. As an example, here’s a MARC XML 100 Main Entry-Personal Name field for Emily Dickinson with a URI reference to the authority file.

<datafield tag="100" ind1="1" ind2=" ">
<subfield code="a">Dickinson, Emily,</subfield>
<subfield code="d">1830-1886.</subfield>
<subfield code="0">http://id.loc.gov/authorities/names/n79054166</subfield>
</datafield>
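As a rough illustration of reading that field, here is a minimal Python sketch that pulls the name and the authority URI out of the datafield above. It is not the TRLN Discovery ingest code: the function name `extract_name_and_uri` and the `LCNAF_PREFIX` check are assumptions made for this example, and real MARC XML usually carries an XML namespace that this snippet does not handle.

```python
import xml.etree.ElementTree as ET

# The 100 field from the example above (no XML namespace, as shown in the post).
MARC_FIELD = """<datafield tag="100" ind1="1" ind2=" ">
<subfield code="a">Dickinson, Emily,</subfield>
<subfield code="d">1830-1886.</subfield>
<subfield code="0">http://id.loc.gov/authorities/names/n79054166</subfield>
</datafield>"""

# Assumption for this sketch: only URIs under this prefix count as LCNAF references.
LCNAF_PREFIX = "http://id.loc.gov/authorities/names/"


def extract_name_and_uri(datafield_xml):
    """Return (display name, LCNAF URI or None) for one MARC XML name field."""
    field = ET.fromstring(datafield_xml)
    # Map subfield code -> text; a repeated code would overwrite, fine for a sketch.
    subfields = {
        sf.get("code"): (sf.text or "").strip()
        for sf in field.findall("subfield")
    }
    # Join name ($a) and dates ($d), then drop the trailing MARC punctuation.
    parts = [subfields.get("a"), subfields.get("d")]
    name = " ".join(p for p in parts if p).rstrip(".")
    uri = subfields.get("0")
    if uri is None or not uri.startswith(LCNAF_PREFIX):
        uri = None
    return name, uri


name, uri = extract_name_and_uri(MARC_FIELD)
print(name)
print(uri)
```

A field without a subfield 0 yields `None` for the URI, which matches the behavior described earlier: records without an authority reference get no variant names.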

We store this URI reference in the TRLN Discovery name field and then use this URI reference at ingest time to look up and index the variant names from a local cache of the variant names. Here’s the stored name for Emily Dickinson in the TRLN Discovery index.

names_a: ["{\"name\":\"Dickinson, Emily, 1830-1886\",\"rel\":\"author\",\"type\":\"creator\",\"id\":\"http://id.loc.gov/authorities/names/n79054166\"}"]

The TRLN Discovery ingest service keeps its own cache of the name identifiers and variant names for efficient lookup at ingest time. We use Redis, an open-source, in-memory (very fast) data store to make the variant names available when records are ingested. This local cache is built from the LC Name Authority File. Since the name authority file changes over time we will refresh our local cache of the data every 3 months to keep it up to date. We’ve written a script (Rails Rake task) that automates this update process.
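To make the lookup step concrete, here is a small Python sketch of the idea. It is an illustration under stated assumptions, not the actual ingest service: a plain dictionary (`VARIANT_CACHE`) stands in for the Redis cache, the cache layout (authority URI mapped to a list of variant names) is assumed, and `variant_names_for` is a hypothetical helper name.

```python
import json

# Stand-in for the Redis cache (assumed layout: authority URI -> variant names).
# Only two of Emily Dickinson's variants are included to keep the example short.
VARIANT_CACHE = {
    "http://id.loc.gov/authorities/names/n79054166": [
        "Dickinson, Emilia, 1830-1886",
        "Dickinson, Emily Elizabeth, 1830-1886",
    ],
}

# The stored name field from the example above: a list of JSON-encoded strings.
names_a = ["{\"name\":\"Dickinson, Emily, 1830-1886\",\"rel\":\"author\",\"type\":\"creator\",\"id\":\"http://id.loc.gov/authorities/names/n79054166\"}"]


def variant_names_for(stored_names, cache):
    """Collect variant names for every stored name that has an authority URI."""
    variants = []
    for raw in stored_names:
        entry = json.loads(raw)
        uri = entry.get("id")
        if not uri:
            # No URI reference on this name, so there is nothing to look up.
            continue
        # A URI missing from the cache simply contributes no variants.
        variants.extend(cache.get(uri, []))
    return variants


print(variant_names_for(names_a, VARIANT_CACHE))
```

Keeping the lookup keyed on the URI rather than on the name string means that refreshing the cache from a newer copy of the authority file does not require touching the stored names.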

What’s next?

The addition of stored name authority URIs in the TRLN Discovery index opens up opportunities to add more features in the future. I’m especially interested in displaying more contextual information about creators in our catalog. We could also expose “See also” references from the authority files to make it easier to find works by the same person published under different names (“Twain, Mark, 1835-1910” being a good example):

  • Clemens, Samuel Langhorne, 1835-1910
  • Conte, Louis de, 1835-1910
  • Snodgrass, Quintus Curtius, 1835-1910

Mark Twain

As always, we continue to add features and make incremental improvements to TRLN Discovery, and your feedback is critical. Please let us know how things are working for you using the feedback form available on every page of the Books & Media Catalog.

What does it mean to be an actively antiracist developer?

The library has been committed to Diversity, Equity, and Inclusion for the past year and more, specifically through the work of DivE-In and the Anti-Racist Roadmap. And to that end, the Digital Strategies and Technology department, where I work, has also been focusing on these issues. So lately I’ve been thinking a lot about how, as a web developer, I can be actively antiracist in my work.

First, some context. As a cis-gendered white male who is gainfully employed and resides in one of the best places to live in the country, I am soaking in privilege. So take everything I have to say with that large grain of salt. My first job out of college was working at a tech startup that was founded and run by a black person. To my memory, the overall makeup of the staff was something like 40–50% BIPOC, so my introduction to the professional IT world was that it was normal to see people who were different than me. However, in subsequent jobs my coworker pool has been much less diverse and more representative of the industry in general, which is to say very white and very male, which I think is a problem. So how can an industry that lacks diversity actively work on promoting the importance of diversity? How can we push back against systematic racism and oppression when we benefit from those very systems? I don’t think there are any easy answers.

Antiracist Baby by Ibram X. Kendi

I think it’s important to recognize that for organizations driven by top-down decision making, sweeping change needs to come from above. To quote one of my favorite bedtime stories, “Point at policies as the problem, not people. There’s nothing wrong with the people!” But that doesn’t excuse ‘the people’ from doing the hard work that can lead to profound change. I believe an important first step is to acknowledge your own implicit bias (if you are able, attend Duke IT’s Implicit Bias in the Workplace Training). Confronting these issues is an uncomfortable process, but I think ultimately that’s a good thing. And at least for me, I think doing this work is an ongoing process. I don’t think my implicit biases will ever truly go away, so it’s up to me to constantly be on the lookout for them and to broaden my horizons and experiences.

So in addition to working on our internalized biases, I think we can also work on how we communicate with each other as coworkers. In a recent DST-wide meeting concerning racial equity at DUL, the group I was in talked a lot about interpersonal communication. We should recognize that we all have blind spots and patterns that we slip into, like being overly jargony, being terse and/or confrontational, and so on. We have the power to change these patterns. I think we also need to be thoughtful of the language we use and the words that we speak. We need to appreciate diversity of backgrounds and be mindful of the mental taxation of code switching. We can try to help each other feel more comfortable in our own skin and feel safe expressing our thoughts and ideas. I think it’s profoundly important to meet people from a place of empathy and mutual respect. And we should not pass up the opportunities to have difficult conversations with each other. If I say something loaded with a microaggression and make a colleague feel uncomfortable or slighted, I want to be called out. I want to learn from my mistakes, and I would think that’s true for all of my coworkers.

Axe-con is an open and inclusive digital accessibility conference

We can also incorporate anti-racist practices into the things we create. Throughout my career, I’ve tried to always promote the benefits of building accessible interfaces that follow the practices of universal design. Building things with accessibility in mind is good for everyone, not just those who make use of assistive technologies. And as an aside, axe-con 2021 was packed full of great presentations, and recordings are available for free. We can take small steps like removing problematic language from our workflows (“master” branches are now “main”). But I think and hope we can do more. Some areas where I think we have an opportunity to be more proactive would be doing an assessment of our projects and tools to see to what degree (if at all) we seek out feedback and input from BIPOC staff and patrons. How can we make sure their voices are represented in what we create?

I don’t have many good answers, but I will keep listening, and learning, and growing.

An Intern’s Investigation on Decolonizing Archival Descriptions and Legacy Metadata

This post was written by Laurier Cress. Laurier Cress is a graduate student at the University of Denver studying Library Science with an emphasis on digital collections, rare books and manuscripts, and social justice in librarianship and archives. In addition to LIS topics, she is also interested in Medieval and Early Modern European History. Laurier worked as a practicum intern with the Digital Collections and Curation Services Department this winter to investigate auditing practices for decolonizing archival descriptions and metadata. Laurier will complete her master’s degree in the Fall of 2021. In her spare time, she also runs a YouTube channel called Old Dirty History, where she discusses historic events, people, and places throughout history.

Now that diversity, equity, and inclusion (DEI) are popular concerns for libraries throughout the United States, discussions on DEI are inescapable. These three words have become recurring buzzwords dropped in meetings, classroom lectures, class syllabi, presentations, and workshops across the LIS landscape. While in some contexts, topics in DEI are thrown around with no sincere intent or value behind them, some institutions are taking steps to give meaning to DEI in librarianship. As an African American MLIS student at the University of Denver, I can say I have listened to one too many superficial talks on why DEI is important in our field. These conversations customarily exclude any examples of what DEI work actually looks like. When Duke Libraries advertised a practicum opportunity devoted to hands-on experience exploring auditing practices for legacy metadata and harmful archival descriptions, I was immediately sold. I saw this experience as an opportunity to learn what scholars in our field are actually doing to make libraries a more equitable and diverse place.

As a practicum intern in Duke Libraries’ Digital Collections and Curation Services (DCCS) department, I spent three months exploring frameworks for auditing legacy metadata against DEI values and investigating harmful language statements for the department. Part of this work also included applying what I learned to Duke’s collections. Duke’s digital collections boast 131,169 items and 997 collections, across 1,000 years of history from all over the world. Many of the collections represent a diverse array of communities that contribute to the preservation of a variety of cultural identities. It is the responsibility of institutions with cultural heritage holdings to present, catalog, and preserve their collections in a manner that accurately and respectfully portrays the communities depicted within them. However, many institutions housing cultural heritage collections use antiquated archival descriptions and legacy metadata that should be revisited to better reflect 21st century language and ideologies. It is my hope that this brief overview on decolonizing archival collections not only aids Duke, but other institutions as well.

Harmful Language Statement Investigation

During the first phase of my investigation, I conducted an analysis on harmful language statements across several educational institutions throughout the United States. This analysis served as a launchpad for investigating how Duke can improve upon their inclusive description statement for their digital collections. During my investigation, I created a list that comprises 41 harmful language statements. Some of these institutions include:

  • The Walters Museum of Art
  • Princeton University
  • University of Denver
  • Stanford University
  • Yale University

After gathering a list of institutions with harmful language statements, the next phase of my investigation was to conduct a comparative analysis to uncover what they had in common and how they differed. For this analysis, 12 harmful language statements were selected at random from the total list. From this investigation, I created the Harmful Statement Research Log to record my findings. The research log comprises two tabs. The first tab includes a list of harmful statements from 12 institutions, with supplemental comments and information about each statement. The second tab provides a list of 15 observations deduced from cross-examining the 12 harmful language statements. Some observations made include placement, length, historical context, and Library of Congress Subject Heading (LCSH) disclaimers. It is important for me to note that, while some of the information provided within the research log is based on pure observation, much of the report also includes conclusions based on personal opinions born from my own perspective as a user.

Decolonizing Archival Descriptions & Legacy Metadata

The next phase in my research was to investigate frameworks and current sentiments on decolonizing archival description and legacy metadata for Duke’s digital collections. Due to the limited amount of research on this subject, most of the information I came across was related to decolonizing collections describing Indigenous peoples in Canada and African American communities. I found that the influence of late 19th- and early 20th-century library classification systems can still be found within archival descriptions and metadata in contemporary library collections. The use of dated language within library and archival collections encourages the inequality of underrepresented groups through the promotion of discriminatory infrastructures established by these earlier classification systems. In many cases, offensive archival descriptions are sourced from donors and creators. While it is important for information institutions to preserve the historical context of records within their collections, descriptions written by creators should be contextualized to help users better understand the racial connotation surrounding the record. Issues regarding contextualizing racist ideologies from the past can be found throughout Duke’s digital collections.

During my investigation, I examined Duke’s MARC records from the collection level to locate examples of harmful language used within their descriptions. The first harmful archival description I encountered was from the Alfred Boyd Papers. The archival description describes a girl referenced within the papers as “a free mulatto girl”. This is an example of when archival description should not shy away from the realities of racist language used during the period the collection was created in; however, context should be applied. “Mulatto” was an offensive term used during the era of slavery in the United States to refer to people of African and White European ancestry. It originates from the Spanish word “mulato”, and its literal meaning is “young mule”. While this word is used to describe the girl within the papers, it should not be used to describe the person within the archival description without historical context.

Screenshot of metadata from the Alfred Boyd papers

When describing materials concerning marginalized peoples, it is important to preserve creator-sourced descriptions, while also contextualizing them. To accomplish this, there should be a defined distinction between descriptions from the creator and the institution’s archivists. Some institutions, like The Morgan Library and Museum, use quotation marks as part of their in-house archival description procedure to differentiate between language originating from collectors or dealers versus their archivists. It is important to preserve contextual information, when racism is at the core of the material being described, in order for users to better understand the collection’s historic significance. While this type of language can bring about feelings of discomfort, it is also important to not allow your desire for comfort to take precedence over conveying histories of oppression and power dynamics. Placing context over personal comfort also takes the form of describing relationships of power and acts of violence just as they are. Acts of racism, colonization, and white supremacy should be labeled as such. For example, Duke’s Stephen Duvall Doar Correspondence collection describes the act of “hiring” enslaved people during the Civil War. Slavery does not imply hired labor because hiring implies some form of compensation. Slavery can only equate to forced labor and should be described as such.

Several academic institutions have taken steps to decolonize their collections. At the beginning of my investigation, a mentor of mine referred me to the University of Alberta Library’s (UAL) Head of Metadata Strategies, Sharon Farnel. Farnel and her colleagues have done extensive work on decolonizing UAL’s holdings related to Indigenous communities. The university declared a call to action to protect the representation of Indigenous groups and to build relationships with other institutions and Indigenous communities. Although UAL’s call to action encompasses more than decolonizing their collections, for the sake of this article, I will focus solely on the framework they established to decolonize their archival descriptions.

Community Engagement is Not Optional

Farnel and her colleagues created a team called the Decolonizing Description Working Group (DDWG). Their purpose was to propose a plan of action on how descriptive metadata practices could more accurately and respectfully represent Indigenous peoples. The DDWG included a Metadata Coordinator, a Cataloguer, a Public Service Librarian, a Coordinator of Indigenous Initiatives, and a self-identified Indigenous MLIS Intern. Much of their work consisted of consulting with the community and collaborating with other institutions. When I reached out to Farnel, she was so kind and generous with sharing her experience as part of the DDWG. Farnel told me that the community engagement approach taken is dependent on the community. Marginalized peoples are not a monolith; therefore, there is no “one size fits all” solution. If you are going to consult community members, recognize the time and expertise the community provides. This relationship has to be mutually beneficial, with the community’s needs and requests at the forefront at all times.

For the DDWG, the best course of action was to start building a relationship with local Indigenous communities. Before engaging with the entire community, the team first engaged with community elders to learn how to proceed with consulting the community from a place of respect. Because the DDWG’s work took place prior to COVID-19, most meetings with the community took place in person. Farnel refers to these meetings as “knowledge gathering events”. Food and beverages were provided, along with a safe space for open conversation. A community elder would start the session to set the tone.

In addition to knowledge gathering events, Aboriginal and non-Aboriginal students and alumni were consulted through an informal short online survey. The survey was advertised through an informal social media posting. Once the participants confirmed the desire to partake in the survey, they received an email with a link to complete it. Participants were asked questions based on their feelings and reactions to potentially changing the Library of Congress Subject Headings (LCSH) that related to Aboriginal content.

Auditing Legacy Metadata and Archival Descriptions

There is more than one approach an institution can take to start auditing legacy metadata and descriptions. In a case study written by Dorothy Berry, who is currently the Digital Collections Program Manager at Harvard’s Houghton Library, she describes a digitization project that took place at the University of Minnesota Libraries. The purpose of the project was to not only digitize African American heritage materials within the university’s holdings, but to also explore ways mass digitization projects can help re-aggregate marginalized materials. This case study serves as an example of how collections can be audited for legacy metadata and archival descriptions during mass digitization projects. Granted, this specific project received funding to support such an undertaking and not all institutions have the funds required to take on an initiative of this magnitude. However, this type of work can be done slowly over a longer period of time. Simply running a report to search for offensive terms such as “negro”, or in my case “mulatto”, is a good place to start. Be open to having discussions with staff to learn what offensive language they also have come across. Self-reflection and research are equally important. Princeton University Library’s inclusive description working group spent two years researching and gathering data on their collections before implementing any changes. Part of their auditing process also included using an XQuery script to locate harmful descriptions and recover histories that were marginalized due to lackluster description.
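As a starting point for the kind of term report mentioned above, here is a minimal Python sketch. It is an illustration only, not the script Princeton or Duke used: the record identifiers and the second description are hypothetical, `find_flagged_terms` is an assumed helper name, and the term list contains only the two terms named in this post. A report like this surfaces descriptions for human review; it cannot decide whether a term is quoted from a creator or needs added context.

```python
import re

# Terms named in this post; a real audit would build a longer list with staff
# and community input.
FLAGGED_TERMS = ["negro", "mulatto"]

# Hypothetical records: identifier -> description text.
SAMPLE_RECORDS = {
    "example-001": "Papers include a reference to \"a free mulatto girl\".",
    "example-002": "Photographs of architecture and landscapes.",
}


def find_flagged_terms(records, terms):
    """Return (record id, sorted matched terms) for each record with a hit."""
    # Whole-word, case-insensitive match on any of the listed terms.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    report = []
    for record_id, description in records.items():
        hits = sorted({m.group(1).lower() for m in pattern.finditer(description)})
        if hits:
            report.append((record_id, hits))
    return report


for record_id, hits in find_flagged_terms(SAMPLE_RECORDS, FLAGGED_TERMS):
    print(record_id, ", ".join(hits))
```

Matching on whole words keeps the report from flagging unrelated words that merely contain a listed term, at the cost of missing plurals and variant spellings unless they are added to the list.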

Creators Over Community = Problematic

While exploring Duke’s digital collections, one problem that stood out to me the most was the perpetual valorization of creators. This is often found in collections with creators who are white men. Adjectives like “renowned”, “genius”, “talented”, and “preeminent” are used to praise the creators and make the collection more about them instead of the community depicted within the collection. An example of this troublesome language can be found in Duke’s Sidney D. Gamble Photographs collection. This collection comprises over 5,000 black and white photographs taken by Sidney D. Gamble during his four visits to China from 1908 to 1932. Content within the photographs encompasses depictions of people, architecture, livestock, landscapes, and more. Very little emphasis is placed on the community represented within this collection. Little, if any, historical or cultural context is given to help educate users on the culture behind the collection. And the predominant language used here is English. However, there is a full page of information on the life and exploits of Gamble.

Screenshot of a description of the Sidney Gamble digital collection.

Describing Communities

Harmful language used to describe individuals represented within digital collections can be found everywhere. This is not always intentional. Dorothy Berry’s presentation with the Sunshine State Digital Network on conscious editing serves as a great source of knowledge on problematic descriptions that can be easily overlooked. Some of Berry’s examples include:

  • Class: Examples include using descriptions such as “poor family” or “below the poverty line”.
  • Race & Ethnicity: Examples include using dehumanizing vocabulary to describe someone of a specific ethnicity or excluding describing someone of a specific race within an image.
  • Gender: Example includes referring to a woman using her husband’s full name (Mrs. John Doe) instead of her own.
  • Ability: Example includes using offensive language like “cripple” to describe disabled individuals.

This is only a handful of problematic description examples from Berry’s presentation. I highly recommend watching not only Berry’s presentation, but the entire Introduction to Conscious Editing Series.

Library of Congress Subject Headings (LCSH) Are Unavoidable

I could talk about LCSH in relation to decolonizing archival descriptions for days on end, but for the sake of wrapping up this post I won’t. In a perfect world we would stop using LCSH altogether. Unfortunately, this is impossible. Many institutions use custom-made subject headings to promote their collections respectfully and appropriately. However, the problem with using custom-made subject headings that are more culturally relevant and respectful is accessibility. If no one is using your custom-made subject headings when conducting a search, users and aggregators won’t find the information. This defeats the purpose of decolonizing archival collections, which is to make collections that represent marginalized communities more accessible.

What we can do is be as cognizant as possible of the LCSHs we are using and avoid harmful subject headings as much as possible. If you are uncertain whether an LCSH is harmful, conduct research or consult with communities who desire to be part of your quest to remove harmful language from your collections. Let your users know why you are limited to subject headings that may be harmful and that you recognize the issue this presents to the communities you serve. Also consider collaborating with Cataloginglab.org to help design new LCSH proposals and to stay abreast of new LCSH that better reflect DEI values. There are also some alternative thesauri, like homosaurus.org and Xwi7xwa Subject Headings, that better describe underrepresented communities.

Resources

In support of Duke Libraries’ intent to decolonize their digital collections, I created a Google Drive folder that includes all the fantastic resources I included in my research on this subject. Some of these resources include metadata auditing practices from other institutions, recommendations on how to include communities in archival description, and frameworks for decolonizing their descriptions.

While this short overview provides a wealth of information gathered from many scholars, associations, and institutions who have worked hard to make libraries a better place for all people, I encourage anyone reading this to continue reading literature on this topic. This overview does not come close to covering half of what invested scholars and institutions have contributed to this work. I do hope it encourages librarians, catalogers, and metadata architects to take a closer look at their collections.

Access for One, Access for All: DPC’s Approach towards Folder Level Digitization

Earlier this year and prior to the pandemic, Digital Production Center (DPC) staff piloted an alternative approach to digitize patron requests with the Rubenstein Library’s Research Services (RLRS) team. The previous approach was focused on digitizing specific items that instruction librarians and patrons requested, and these items were delivered directly to that person. The alternative strategy, the Folder Level digitization approach, involves digitizing the contents of the entire folder that the item is contained in, ingesting these materials to the Duke Digital Repository (to enable Duke Library staff to retrieve these items), and when possible, publishing these materials so that they are available to anyone with internet access. This soft launch prepared us for what is now an all-hands-on-deck-but-in-a-socially-distant-manner digitization workflow.

Giao Luong Baker assessing folders in the DPC.

Since returning to campus for onsite digitization in late June, the DPC’s primary focus has been to perfect and ramp up this new workflow. It is important to note that the term “folder” in this case is more of a concept and that its contents and their conditions vary widely. Some folders may have 2 pages; other folders have over 300 pages. Some folders consist of pamphlets, notebooks, maps, papyri, and bound items. All this to say that a “folder” is a relatively loose term.

Like many initiatives at Duke Libraries, Folder Level Digitization is not just a DPC operation; it is a collaborative effort. This effort includes RLRS working with instructors and patrons to identify and retrieve the materials. RLRS also works with Rubenstein Library Technical Services (RLTS) to create starter digitization guides, which are the building blocks for our digitization guide. Lastly, RLRS vets the materials and determines their level of access. When necessary, Duke Libraries’ Conservation team steps in to prepare materials for digitization. After the materials are digitized, ingest and metadata work by the Digital Collections and Curation Services and RLTS teams ensures that the materials are preserved and available in our systems.

Kristin Phelps captures a color target.

Doing this work in the midst of a pandemic requires that DPC work closely with the Rubenstein Library Access Services Reproduction Team (a section of RLRS) to track our workflow using a Google Doc. We track materials from the point where they are identified by RLRS, through multiple quarantine periods, scanning, post processing, and file delivery, to ingest. DPC staff are also digitizing in a manner that is consistent with COVID-19 guidelines. Materials are quarantined before and after they arrive at the DPC, machines and workspaces are cleaned before and after use, capture is done in separate rooms, and quality control is done off site with specialized calibrated monitors.

Since we started Folder Level digitization, the DPC has received close to 200 unique Instruction and Patron requests from RLRS. As of the publication of this post, 207 individual folders (an individual request may contain several folders) have been digitized. In total, we’ve scanned and quality controlled over 26,000 images since we returned to campus!

By digitizing entire folders, we hope this will allow for increased access to the materials without risking damage through their physical handling. So far we anticipate that 80 new digital collections will be ingested to the Duke Digital Repository. This number will only grow as we receive more requests. Folder Level Digitization is an exciting approach towards digital collection development, as it is directly responsive to instruction and researcher needs. With this approach, it is access for one, access for all!

How to Videos for Using Digital Collections

With so much remote instruction and research happening due to the current global pandemic, more and more folks are dependent on Duke Libraries Digital Collections.  How can all these potentially new digital researchers learn how to use our interfaces?  Thanks to my colleagues in the Rubenstein Library’s Research Services department, there are now four short how-to videos available to help users understand how to navigate digital collections.

I’ve linked to the videos below.  In just 15 minutes one will hear an introduction to Duke’s Digital Collections, learn how to search within the interface, and use and cite digital items.

If you use Duke Digital Collections regularly, what other topics would you like to see covered in future videos or documentation?

Here’s What Happened Next: The Duke Digital Production Center in the Era of the COVID-19 Pandemic

On March 20, 2020, the Duke University Libraries were closed due to the COVID-19 pandemic.  Amid a great deal of uncertainty as to when the Libraries would reopen, most library staff were sent home to work remotely for the next several months.  During this time, the Digital Production Center’s employees followed suit and, while away from the DPC, completed post-processing of images, performed image quality control, participated in project planning, and wrote blog posts on the closing of the Libraries, labor in the time of the coronavirus, and the history of videotelephony.  Following the end of the North Carolina Stay-at-Home order on April 29, discussions began in earnest about what the new reality would be for the Libraries.  It was determined that the DPC’s unique skill set was needed on site sooner rather than later, and so on June 26, we returned to Duke’s campus as “essential workers.”

Upon our return, we needed to make sure that our equipment was sanitized and in good working order.  Along with testing our scanner and cameras, we also recalibrated our monitors to ensure color accuracy and established our new workflow. 

It was determined that our efforts were most needed to prepare for Duke’s fall instruction materials.  With the uncertainty as to whether or not classes would be held in person or virtually, preparing digital materials to work with was prioritized.  So, we shifted from our normal project work to focus solely on digitization of these materials.  Each digitization specialist was asked to be onsite for 3 days a week to maximize use of our capture equipment.  The remaining two days of the week would be spent working from home to do quality control work on the images as well as various administrative tasks.  We had a plan; our remit was clear and we were working towards a goal.

On August 17, classes began for Duke University and our images began being used as part of instruction materials.  Duke University Libraries’ digitized images helped bridge the gap between Fall 2020 students and the currently inaccessible library collections that Duke faculty and students normally rely on for coursework.

Thinking about the change in use and accessibility for collection materials leads to an interesting question:  With the lockdown which happened for most of the US, did digital collections receive more visits as people were restricted from leaving home and libraries were closed?  A quick glance at the Google Analytics for the Duke Digital Collections shows a 34% increase of unique page views from April 1-June 30 of this year as compared to the same time period in 2019.  While it is impossible to state definitively why the increase occurred, the pandemic is very likely a contributing factor.  Digital collections are arguably valuable assets for any institution which supports them.  They provide easy access to rarely seen or inaccessible materials and they have the potential to incite curiosity in the larger institutional holdings.  It is indeed interesting to consider what types of innovative scholarship and creative use of digital content may result from the pandemic’s “forced” use of digital collections over the next twelve months.

Of course, the rapid onset of the COVID-19 pandemic illuminated the need for alternative ways of operating.  At least temporarily, it has changed the way in which the Duke University Libraries are conducting business as usual these days.  And, in July of this year, Research Libraries UK published a document entitled “COVID19 and the Digital Shift in Action.”  This document reports on the effect of the pandemic on UK research libraries and suggests strategies for emphasis and support of the digital aspects of libraries as well as the need for change and flexibility within library collections.  Digital collections, e-books, e-textbooks, and digital content had their moment to shine during the pandemic and they have proven their value and importance.

And with the potential increased reliance on digitized material, many cultural heritage digitization specialists are now back on site in libraries, museums and archives, working to provide their expertise to add to existing digital collections.  Naturally, at the Duke Digital Production Center, we have been asked a number of times since our return if we are nervous about being back in our studio space.  Of course, we are, but we also recognize how our skills and contributions continue to create value for Duke University and Duke University Libraries.

Further reading:

Biswas, P., & Marchesoni, J. “Analyzing Digital Collections Entrances: What Gets Used and Why It Matters.” Information Technology and Libraries, v. 35, n. 4, p. 19-34, 30 December 2016.

Greenhall, M. “Covid-19 and the digital shift in action,” RLUK Report. 2020.  Can be accessed at:  https://www.rluk.ac.uk/wp-content/uploads/2020/06/Covid19-and-the-digital-shift-in-action-report-FINAL.pdf

Markin, Pablo.  “Pandemic Restrictions on Library Borrowing Showcase the Importance of Digital Collections and the Advantages of Open Access.” Open Research Community.  11 August 2020.  https://openresearch.community/posts/pandemic-restrictions-on-library-borrowing-showcase-the-importance-of-digital-collections-and-the-advantages-of-open-access

Sharing data and research in a time of global pandemic, Part 2

[Header image from Fischer, E., Fischer, M., Grass, D., Henrion, I., Warren, W., Westman, E. (2020, August 07). Low-cost measurement of facemask efficacy for filtering expelled droplets during speech. Science Advances. https://advances.sciencemag.org/content/early/2020/08/07/sciadv.abd3083]

Back in March, just as things were rapidly shutting down across the United States, I wrote a post reflecting on how integral the practice of sharing and preserving research data would be to any solution to the crisis posed by COVID-19. While some of the language in that post seems a bit naive in retrospect (particularly the bit about RDAP’s annual meeting being one of the last in-person conferences of just the spring, as opposed to the entire calendar year!), the emphasis on the importance of rapid and robust data sharing has stood the test of time. In late June, the Research Data Alliance released a set of recommendations and guidelines for sharing research data under circumstances shaped by COVID-19, and a number of organizations, including the National Institutes of Health, have established portals for finding data related to the disease. Access to data has been forefront in the minds of many researchers.

Perhaps in response to this general sentiment (or maybe because folks haven’t been able to access their labs?!), we in the Libraries have seen a notable increase in the number of submissions to our Research Data Repository for data publication. These datasets have derived from a broad range of disciplines, spanning Environmental Sciences to Dermatology. I wanted to use this blog post as an opportunity to highlight a few of our accessions from the last several months.

One of our most prolific sources of data deposits has historically been the lab of Dr. Patrick Charbonneau, associate professor of Chemistry and Physics. Dr. Charbonneau’s lab investigates glass and its physical properties and contributes to a project known as The Simons Collaboration on Cracking the Glass Problem, which addresses issues like disorder, nonlinear response and far-from-equilibrium dynamics. The most recent contribution from Dr. Charbonneau’s research group, published just last week, is fairly characteristic of the materials we receive from Dr. Charbonneau’s group. It contains the raw binary observational data and scripts that were used to create the figures which appear in the researcher’s article. Making these research products available helps other scholars to repeat or reproduce (and thereby strengthen) the findings elucidated in an associated research publication.

Fig01 / Fig02b, Data from: Finite-dimensional vestige of spinodal criticality above the dynamical glass transition

 

Another recent data deposit—a first of its kind for the RDR—is a Q-sort concourse for the Human Dimensions of Large Marine Protected Areas project, which investigates the formulation of large marine protected areas (defined by the project as “any ocean area larger than 100,000 km² that has been designated for the purpose of conservation”) as a global movement. Q-methodology is a psychology and social sciences research method used to study viewpoints. In this study, 40 interviewees were asked to evaluate statements related to large-scale marine protected areas. Q-sorts can be particularly helpful when researchers wish to describe subjective viewpoints related to an issue.

Q sort record sheet from: Q-Sort Concourse and Data for the Human Dimensions of Large MPAs project

Finally, perhaps our most timely deposit has come from a group investigating an alternate method to evaluate the efficacy of masks to reduce the transmission of respiratory droplets during regular speech. “Low-cost measurement of facemask efficacy for filtering expelled droplets during speech,” published last week in Science Advances, is a proof-of-concept study that proposes an optical measurement technique that the group asserts is both inexpensive and easy to use. Because the topic of measuring mask efficiency is still both complex and unsettled, the group hopes this work will help improve evaluation in order to guide mask selection and policy decisions.

Screenshot of Speaker1_None_05.mp4, Video data from: Low-cost measurement of facemask efficacy for filtering expelled droplets during speech

The dataset consists of a series of movie recordings that capture an operator wearing a face mask and speaking in the direction of an expanded laser beam inside a dark enclosure. Droplets that propagate through the laser beam scatter light, which is then recorded with a cell phone camera. The group tested 12 kinds of masks (see below), and recorded 2 sets of controls with no masks.

Figure 2 from Low-cost measurement of facemask efficacy for filtering expelled droplets during speech

We hope to keep up the momentum our data management, curation, and publication program has gained over the last few months, but we need your help! For more information on using the Duke Research Data Repository to share and preserve your data, please visit our website, or drop us a line at datamanagement@duke.edu. A full list of the datasets we’ve published since moving to fully remote operations in March is available below.

  • Zhang, Y. (2020). Data from: Contributions of World Regions to the Global Tropospheric Ozone Burden Change from 1980 to 2010. Duke Research Data Repository. https://doi.org/10.7924/r40p13p11
  • Campbell, L. M., Gray, N., & Gruby, R. (2020). Data from: Q-Sort Concourse and Data for the Human Dimensions of Large MPAs project. Duke Research Data Repository. https://doi.org/10.7924/r4j38sg3b
  • Berthier, L., Charbonneau, P., & Kundu, J. (2020). Data from: Finite-dimensional vestige of spinodal criticality above the dynamical glass transition. Duke Research Data Repository. https://doi.org/10.7924/r4jh3m094
  • Fischer, E., Fischer, M., Grass, D., Henrion, I., Warren, W., Westman, E. (2020). Video data files from: Low-cost measurement of facemask efficacy for filtering expelled droplets during speech. Duke Research Data Repository. V2 https://doi.org/10.7924/r4ww7dx6q
  • Lin, Y., Kouznetsova, T., Chang, C., Craig, S. (2020). Data from: Enhanced polymer mechanical degradation through mechanochemically unveiled lactonization. Duke Research Data Repository. V2 https://doi.org/10.7924/r4fq9x365
  • Chavez, S. P., Silva, Y., & Barros, A. P. (2020). Data from: High-elevation monsoon precipitation processes in the Central Andes of Peru. Duke Research Data Repository. V2 https://doi.org/10.7924/r41n84j94
  • Jeuland, M., Ohlendorf, N., Saparapa, R., & Steckel, J. (2020). Data from: Climate implications of electrification projects in the developing world: a systematic review. Duke Research Data Repository. https://doi.org/10.7924/r42n55g1z
  • Cardones, A. R., Hall, III, R. P., Sullivan, K., Hooten, J., Lee, S. Y., Liu, B. L., Green, C., Chao, N., Rowe Nichols, K., Bañez, L., Shah, A., Leung, N., & Palmeri, M. L. (2020). Data from: Quantifying skin stiffness in graft-versus-host disease, morphea and systemic sclerosis using acoustic radiation force impulse imaging and shear wave elastography. Duke Research Data Repository. https://doi.org/10.7924/r4h995b4q
  • Caves, E., Schweikert, L. E., Green, P. A., Zipple, M. N., Taboada, C., Peters, S., Nowicki, S., & Johnsen, S. (2020). Data and scripts from: Variation in carotenoid-containing retinal oil droplets correlates with variation in perception of carotenoid coloration. Duke Research Data Repository. https://doi.org/10.7924/r4jw8dj9h
  • DiGiacomo, A. E., Bird, C. N., Pan, V. G., Dobroski, K., Atkins-Davis, C., Johnston, D. W., Ridge, J. T. (2020). Data from: Modeling salt marsh vegetation height using Unoccupied Aircraft Systems and Structure from Motion. Duke Research Data Repository. https://doi.org/10.7924/r4w956k1q
  • Hall, III, R. P., Bhatia, S. M., Streilein, R. D. (2020). Data from: Correlation of IgG autoantibodies against acetylcholine receptors and desmogleins in patients with pemphigus treated with steroid sparing agents or rituximab. Duke Research Data Repository. https://doi.org/10.7924/r4rf5r157
  • Jin, Y., Ru, X., Su, N., Beratan, D., Zhang, P., & Yang, W. (2020). Data from: Revisiting the Hole Size in Double Helical DNA with Localized Orbital Scaling Corrections. Duke Research Data Repository. https://doi.org/10.7924/r4k072k9s
  • Kaleem, S. & Swisher, C. B. (2020). Data from: Electrographic Seizure Detection by Neuro ICU Nurses via Bedside Real-Time Quantitative EEG. Duke Research Data Repository. https://doi.org/10.7924/r4mp51700
  • Yi, G. & Grill, W. M. (2020). Data and code from: Waveforms optimized to produce closed-state Na+ inactivation eliminate onset response in nerve conduction block. Duke Research Data Repository. https://doi.org/10.7924/r4z31t79k
  • Flanagan, N., Wang, H., Winton, S., Richardson, C. (2020). Data from: Low-severity fire as a mechanism of organic matter protection in global peatlands: thermal alteration slows decomposition. Duke Research Data Repository. https://doi.org/10.7924/r4s46nm6p
  • Gunsch, C. (2020). Data from: Evaluation of the mycobiome of ballast water and implications for fungal pathogen distribution. Duke Research Data Repository. https://doi.org/10.7924/r4t72cv5v
  • Warnell, K., & Olander, L. (2020). Data from: Opportunity assessment for carbon and resilience benefits on natural and working lands in North Carolina. Duke Research Data Repository. https://doi.org/10.7924/r4ww7cd91

Fun with sticky scrolling

For the past several weeks I’ve had the great fortune to help contribute to our new archival finding aid interface, based on the Stanford ArcLight project. My coworker Sean Aery is a contributor to that project as well as being the lead developer for Duke University Libraries’ implementation.

The new site is set to launch next week (July 1st) and we are all very excited about it. You can read a teaser post about it written by the product owner, Noah Huffman.

Last week we were trying to work out a thorny issue with the overall interface. There was a feature request from the product owner team to make a section of the navigation ‘sticky’ while allowing other parts of the interface to scroll normally. We have a similar setup for viewing results in our catalog, but that was implemented in an ‘older’ way using extra markup and JavaScript. Support for position:sticky; across the major browsers is now at a point where we could try implementing this feature in a much simpler way, so that’s what we did!
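To make the idea concrete, here is a minimal sketch of the kind of styles involved. It is an illustration only, not the code used in our ArcLight implementation: the helper names (stickyPanelStyles, applyStyles), the offset parameter, and the viewport-height cap are all assumptions. The one firm requirement is that a sticky element needs an inset such as top to know where to stick.

```typescript
// Hypothetical helpers for illustration; not taken from the ArcLight codebase.
type StyleMap = { [property: string]: string };

// Build inline styles for a navigation panel that sticks to the top of the
// viewport while the rest of the page scrolls normally.
function stickyPanelStyles(topOffsetPx: number = 0): StyleMap {
  return {
    position: "sticky",
    // Without an inset such as `top`, a sticky element never sticks.
    top: `${topOffsetPx}px`,
    // Assumption: cap the panel at the viewport height and let it scroll
    // on its own when its contents are taller than the window.
    maxHeight: `calc(100vh - ${topOffsetPx}px)`,
    overflowY: "auto",
  };
}

// Copy the styles onto anything with a `style` object. In a browser this
// would be an HTMLElement; a plain object works for experimentation.
function applyStyles(target: { style: object }, styles: StyleMap): void {
  Object.assign(target.style, styles);
}
```

In a browser, something like applyStyles(document.querySelector("nav")!, stickyPanelStyles(16)) would apply these styles to the first nav element. The same declarations could just as easily live in a stylesheet, which is usually the tidier choice.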

Video of scrolling behavior (with scroll bars showing)

Everything was working great with our implementation, except I really wanted to figure out a way to hide the scroll bars when they weren’t being used. The most reliable way to do so seemed to involve some flavor of third-party JavaScript library, and I didn’t want to go down that road. In chatting with Sean about it over Slack, I learned that he wasn’t seeing the scroll bar problems that I was. I was completely flummoxed! Our development environments are more or less identical. He’s running a newer version of macOS than I am, but our browser versions are the same. I just couldn’t wrap my head around why I was seeing different behavior with the scroll bars.

However, some googling revealed that there was a setting in macOS for the scroll bars which I was completely unfamiliar with:

Screenshot of the scroll bar settings options in macOS

I didn’t recall having ever changed that in the past, so I checked with Sean and his was set the same as mine. This felt like the right track, but I still couldn’t imagine what was going on.

Sean also mentioned he was using a trackpad to browse, whereas I have an external mouse attached to my machine. On a whim I tried unhooking the mouse and restarting my computer — and sure enough, that did the trick!

 

See the Pen MWKvmbV by Michael Daul (@mikedaul) on CodePen.

Set the zoom level of the embedded view above to 0.5x to play with the scrolling

 

So the lesson of this story is… if you ever encounter unexpected behavior with scroll bars while doing development work on your Mac, make sure to check your settings and/or account for your use of an external mouse!
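One way to check which kind of scroll bars a browser is actually drawing is to compare an element’s outer width with its inner width. The sketch below is a hedged illustration, not something from our codebase: the names BoxMetrics, scrollbarWidth, and usesOverlayScrollbars are made up, and it assumes the measured element has overflow: scroll and no border, so that any difference between offsetWidth and clientWidth comes from the scroll bar alone.

```typescript
// Hypothetical names for illustration. The shape matches the `offsetWidth`
// and `clientWidth` properties that HTMLElement exposes in browsers.
interface BoxMetrics {
  offsetWidth: number; // outer width, including any always-visible scroll bar
  clientWidth: number; // inner width, excluding the scroll bar
}

// Width in pixels taken up by a vertical scroll bar.
// Assumes the measured element has `overflow: scroll` and no border.
function scrollbarWidth(box: BoxMetrics): number {
  return Math.max(0, box.offsetWidth - box.clientWidth);
}

// Overlay scroll bars (the usual trackpad behavior on macOS) float above
// the content and take up no layout space, so the measured width is zero.
function usesOverlayScrollbars(box: BoxMetrics): boolean {
  return scrollbarWidth(box) === 0;
}
```

In a browser you could pass a temporary probe element straight to usesOverlayScrollbars, since an HTMLElement already has both properties. A result of false would suggest the always-visible style that an attached mouse can trigger.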