Category Archives: Behind the Scenes

Sharing data and research in a time of global pandemic, Part 2

[Header image from Fischer, E., Fischer, M., Grass, D., Henrion, I., Warren, W., Westman, E. (2020, August 07). Low-cost measurement of facemask efficacy for filtering expelled droplets during speech. Science Advances. https://advances.sciencemag.org/content/early/2020/08/07/sciadv.abd3083]

Back in March, just as things were rapidly shutting down across the United States, I wrote a post reflecting on how integral the practice of sharing and preserving research data would be to any solution to the crisis posed by COVID-19. While some of the language in that post seems a bit naive in retrospect (particularly the bit about RDAP’s annual meeting being one of the last in-person conferences of the spring, when it turned out to be one of the last of the entire calendar year!), the emphasis on the importance of rapid and robust data sharing has stood the test of time. In late June, the Research Data Alliance released a set of recommendations and guidelines for sharing research data under circumstances shaped by COVID-19, and a number of organizations, including the National Institutes of Health, have established portals for finding data related to the disease. Access to data has been at the forefront of many researchers’ minds.

Perhaps in response to this general sentiment (or maybe because folks haven’t been able to access their labs?!), we in the Libraries have seen a notable increase in the number of submissions to our Research Data Repository for data publication. These datasets come from a broad range of disciplines, from Environmental Sciences to Dermatology. I wanted to use this blog post as an opportunity to highlight a few of our accessions from the last several months.

One of our most prolific sources of data deposits has historically been the lab of Dr. Patrick Charbonneau, associate professor of Chemistry and Physics. Dr. Charbonneau’s lab investigates glass and its physical properties and contributes to a project known as The Simons Collaboration on Cracking the Glass Problem, which addresses issues like disorder, nonlinear response, and far-from-equilibrium dynamics. The most recent contribution from the group, published just last week, is fairly characteristic of the materials we receive from the lab: it contains the raw binary observational data and the scripts used to create the figures that appear in the associated article. Making these research products available helps other scholars repeat or reproduce (and thereby strengthen) the findings elucidated in the associated research publication.

Fig01 / Fig02b, Data from: Finite-dimensional vestige of spinodal criticality above the dynamical glass transition

 

Another recent data deposit—a first of its kind for the RDR—is a Q-sort concourse for the Human Dimensions of Large Marine Protected Areas project, which investigates the formulation of large marine protected areas (defined by the project as “any ocean area larger than 100,000 km² that has been designated for the purpose of conservation”) as a global movement. Q-methodology is a psychology and social sciences research method used to study viewpoints. In this study, 40 interviewees were asked to evaluate statements related to large-scale marine protected areas. Q-sorts can be particularly helpful when researchers wish to describe subjective viewpoints related to an issue.

Q sort record sheet from: Q-Sort Concourse and Data for the Human Dimensions of Large MPAs project

Finally, perhaps our most timely deposit has come from a group investigating an alternate method for evaluating the efficacy of masks at reducing the transmission of respiratory droplets during regular speech. “Low-cost measurement of facemask efficacy for filtering expelled droplets during speech,” published last week in Science Advances, is a proof-of-concept study that proposes an optical measurement technique the group asserts is both inexpensive and easy to use. Because the topic of measuring mask efficacy is still both complex and unsettled, the group hopes this work will help improve evaluation in order to guide mask selection and policy decisions.

Screenshot of Speaker1_None_05.mp4, Video data from: Low-cost measurement of facemask efficacy for filtering expelled droplets during speech

The dataset consists of a series of movie recordings that capture an operator wearing a face mask and speaking in the direction of an expanded laser beam inside a dark enclosure. Droplets that propagate through the laser beam scatter light, which is then recorded with a cell phone camera. The group tested 12 kinds of masks (see below) and recorded two sets of controls with no mask.

Figure 2 from Low-cost measurement of facemask efficacy for filtering expelled droplets during speech

We hope to keep up the momentum our data management, curation, and publication program has gained over the last few months, but we need your help! For more information on using the Duke Research Data Repository to share and preserve your data, please visit our website, or drop us a line at datamanagement@duke.edu. A full list of the datasets we’ve published since moving to fully remote operations in March is available below.

  • Zhang, Y. (2020). Data from: Contributions of World Regions to the Global Tropospheric Ozone Burden Change from 1980 to 2010. Duke Research Data Repository. https://doi.org/10.7924/r40p13p11
  • Campbell, L. M., Gray, N., & Gruby, R. (2020). Data from: Q-Sort Concourse and Data for the Human Dimensions of Large MPAs project. Duke Research Data Repository. https://doi.org/10.7924/r4j38sg3b
  • Berthier, L., Charbonneau, P., & Kundu, J. (2020). Data from: Finite-dimensional vestige of spinodal criticality above the dynamical glass transition. Duke Research Data Repository. https://doi.org/10.7924/r4jh3m094
  • Fischer, E., Fischer, M., Grass, D., Henrion, I., Warren, W., & Westman, E. (2020). Video data files from: Low-cost measurement of facemask efficacy for filtering expelled droplets during speech. Duke Research Data Repository. V2 https://doi.org/10.7924/r4ww7dx6q
  • Lin, Y., Kouznetsova, T., Chang, C., & Craig, S. (2020). Data from: Enhanced polymer mechanical degradation through mechanochemically unveiled lactonization. Duke Research Data Repository. V2 https://doi.org/10.7924/r4fq9x365
  • Chavez, S. P., Silva, Y., & Barros, A. P. (2020). Data from: High-elevation monsoon precipitation processes in the Central Andes of Peru. Duke Research Data Repository. V2 https://doi.org/10.7924/r41n84j94
  • Jeuland, M., Ohlendorf, N., Saparapa, R., & Steckel, J. (2020). Data from: Climate implications of electrification projects in the developing world: a systematic review. Duke Research Data Repository. https://doi.org/10.7924/r42n55g1z
  • Cardones, A. R., Hall, III, R. P., Sullivan, K., Hooten, J., Lee, S. Y., Liu, B. L., Green, C., Chao, N., Rowe Nichols, K., Bañez, L., Shah, A., Leung, N., & Palmeri, M. L. (2020). Data from: Quantifying skin stiffness in graft-versus-host disease, morphea and systemic sclerosis using acoustic radiation force impulse imaging and shear wave elastography. Duke Research Data Repository. https://doi.org/10.7924/r4h995b4q
  • Caves, E., Schweikert, L. E., Green, P. A., Zipple, M. N., Taboada, C., Peters, S., Nowicki, S., & Johnsen, S. (2020). Data and scripts from: Variation in carotenoid-containing retinal oil droplets correlates with variation in perception of carotenoid coloration. Duke Research Data Repository. https://doi.org/10.7924/r4jw8dj9h
  • DiGiacomo, A. E., Bird, C. N., Pan, V. G., Dobroski, K., Atkins-Davis, C., Johnston, D. W., & Ridge, J. T. (2020). Data from: Modeling salt marsh vegetation height using Unoccupied Aircraft Systems and Structure from Motion. Duke Research Data Repository. https://doi.org/10.7924/r4w956k1q
  • Hall, III, R. P., Bhatia, S. M., & Streilein, R. D. (2020). Data from: Correlation of IgG autoantibodies against acetylcholine receptors and desmogleins in patients with pemphigus treated with steroid sparing agents or rituximab. Duke Research Data Repository. https://doi.org/10.7924/r4rf5r157
  • Jin, Y., Ru, X., Su, N., Beratan, D., Zhang, P., & Yang, W. (2020). Data from: Revisiting the Hole Size in Double Helical DNA with Localized Orbital Scaling Corrections. Duke Research Data Repository. https://doi.org/10.7924/r4k072k9s
  • Kaleem, S. & Swisher, C. B. (2020). Data from: Electrographic Seizure Detection by Neuro ICU Nurses via Bedside Real-Time Quantitative EEG. Duke Research Data Repository. https://doi.org/10.7924/r4mp51700
  • Yi, G. & Grill, W. M. (2020). Data and code from: Waveforms optimized to produce closed-state Na+ inactivation eliminate onset response in nerve conduction block. Duke Research Data Repository. https://doi.org/10.7924/r4z31t79k
  • Flanagan, N., Wang, H., Winton, S., & Richardson, C. (2020). Data from: Low-severity fire as a mechanism of organic matter protection in global peatlands: thermal alteration slows decomposition. Duke Research Data Repository. https://doi.org/10.7924/r4s46nm6p
  • Gunsch, C. (2020). Data from: Evaluation of the mycobiome of ballast water and implications for fungal pathogen distribution. Duke Research Data Repository. https://doi.org/10.7924/r4t72cv5v
  • Warnell, K., & Olander, L. (2020). Data from: Opportunity assessment for carbon and resilience benefits on natural and working lands in North Carolina. Duke Research Data Repository. https://doi.org/10.7924/r4ww7cd91

EDTF-Humanize 2.0 with Improved Internationalization Support

About four years ago we released a small Ruby gem (EDTF-Humanize) to generate human-readable dates out of Extended Date Time Format (EDTF) dates. For some background on our use of the EDTF standard, please see our previous blog posts on the topic: EDTF-Humanize, Enjoy your Metadata: Fun with Date Encoding, and It’s Date Night Here at Digital Projects and Production Services.

Some recent community contributions to the gem, as well as some extra time as we transition from one work cycle to another, provided an opportunity for maintenance and refinement of EDTF-Humanize. The primary improvement is better support for languages other than English via Ruby I18n locale configuration files and a language-specific module override pattern. Support for French is now included, and support for other languages may be added following the same approach.
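If you haven’t used the gem before, here is a minimal sketch of typical usage. The require statement and the exact output strings in the comments are illustrative and may vary by gem version and locale configuration; the French output shown simply follows the translation file reproduced below.

require "edtf"
require "edtf-humanize"

# Parse an EDTF string; edtf-humanize adds a #humanize method to the result.
date = Date.edtf("1984-06~")   # an approximate month
puts date.humanize             # English default, e.g. "circa June 1984"

# Switch to the bundled French locale and humanize the same date.
I18n.locale = :fr
puts date.humanize             # e.g. "juin 1984 environ"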

The primary means of adding additional languages to EDTF-Humanize is to add a translation file to config/locales/. This is the translation file included to support French:

fr:
  date:
    day_names: [Dimanche, Lundi, Mardi, Mercredi, Jeudi, Vendredi, Samedi]
    abbr_day_names: [Dim, Lun, Mar, Mer, Jeu, Ven, Sam]
    # Don't forget the nil at the beginning; there's no such thing as a 0th month
    month_names: [~, Janvier, Février, Mars, Avril, Mai, Juin, Juillet, Août, Septembre, Octobre, Novembre, Decembre]
    abbr_month_names: [~, Jan, Fev, Mar, Avr, Mai, Jun, Jul, Aou, Sep, Oct, Nov, Dec]
    seasons:
      spring: "printemps"
      summer: "été"
      autumn: "automne"
      winter: "hiver"
  edtf:
    terms:
      approximate_date_prefix_day: ""
      approximate_date_prefix_month: ""
      approximate_date_prefix_year: ""
      approximate_date_suffix_day: " environ"
      approximate_date_suffix_month: " environ"
      approximate_date_suffix_year: " environ"
      decade_prefix: "Les années "
      decade_suffix: ""
      century_suffix: ""
      interval_prefix_day: "Du "
      interval_prefix_month: "De "
      interval_prefix_year: "De "
      interval_connector_approximate: " à "
      interval_connector_open: " à "
      interval_connector_day: " au "
      interval_connector_month: " à "
      interval_connector_year: " à "
      interval_unspecified_suffix: "s"
      open_start_interval_with_day: "Jusqu'au %{date}"
      open_start_interval_with_month: "Jusqu'en %{date}"
      open_start_interval_with_year: "Jusqu'en %{date}"
      open_end_interval_with_day: "Depuis le %{date}"
      open_end_interval_with_month: "Depuis %{date}"
      open_end_interval_with_year: "Depuis %{date}"
      set_dates_connector_exclusive: ", "
      set_dates_connector_inclusive: ", "
      set_earlier_prefix_exclusive: 'Le ou avant '
      set_earlier_prefix_inclusive: 'Le et avant '
      set_last_date_connector_exclusive: " ou "
      set_last_date_connector_inclusive: " et "
      set_later_prefix_exclusive: 'Le ou après '
      set_later_prefix_inclusive: 'Le et après '
      set_two_dates_connector_exclusive: " ou "
      set_two_dates_connector_inclusive: " et "
      uncertain_date_suffix: "?"
      unknown: 'Inconnue'
      unspecified_digit_substitute: "x"
    formats:
      day_precision_strftime_format: "%-d %B %Y"
      month_precision_strftime_format: "%B %Y"
      year_precision_strftime_format: "%Y"

In addition to the translation file, the methods used to construct the human readable string for each EDTF date object type may be completely overridden for a language if needed. For instance, when the date object is an instance of EDTF::Century the French language uses a different method from the default to construct the humanized form. This override is accomplished by adding a language module for the French language that includes the Default module and also includes a Century module that overrides the default behavior. The override is here (minus the internals of the humanizer method) as an example:

# lib/edtf/humanize/language/french.rb
module Edtf
  module Humanize
    module Language
      module French
        include Default
        module Century
          extend self

          def humanizer(date)
            # Special French handling for EDTF::Century
          end
        end
      end
    end
  end
end

EDTF-Humanize version 2.0.0 is available on rubygems.org and on GitHub. Documentation is available on GitHub. Pull requests are welcome; I’m especially interested in contributions to add support for languages in addition to English and French.

In a (Temporary) Time of Remote Work, Duke’s FOLIO Implementation Continues

Duke University is an early adopter of FOLIO, an open source library services platform that will give us tools to better support the information needs of our students, faculty, and staff. A core team in Library Systems and Integration Support began forming in January 2019 to help Duke move to FOLIO; I joined that team at the same time and began work as an IT Business Analyst.

In preparation for going live with FOLIO, we formally kicked off our local implementation effort in January 2020. More than 40 local subject experts have joined small group teams to work on different parts of the FOLIO project. These experts are invaluable to Library IT staff: they know how the library’s work is done and which features need to be prioritized, and they are committed to figuring out how to transition their work into the FOLIO environment.

If you’re reading this in April 2020 and thinking “wasn’t January ten years ago?” you’re not alone. Because the FOLIO Project is international, with partners all over the world, many of us are used to working via remote tools like Slack, Microsoft Teams, and Zoom. But that is a far cry from doing ALL of our work that way, while also taking care of our families and ourselves. It’s a huge credit to all library staff that while the University was swiftly pivoting to remote work, we were able to keep our implementation work going.

One of the first big, messy areas that we knew we needed to work on was locations.

Locations are essential to how patrons know where an item is at the Duke Libraries. When you look up a book in our catalog and the system tells you Where to Find It, it’s using location information from our systems. Library staff also use locations to understand how often items are borrowed, decide when to move items to our off-campus storage, and decide when to buy new items to keep our collections up to date.

A group of FOLIO team members came together from different working areas, including public services, cataloging, acquisitions, digital resources and assessment. I convened those discussions as a lead for our Configurations team. Over the course of late February and March 2020, we met three times as a group using Zoom and delved deep into learning about locations in our current system and how they will work in FOLIO. Staff members shared their knowledge with each other about their functional areas, allowing us to identify potential gaps in FOLIO functionality, as well as things we could improve now, without waiting for FOLIO to deploy.

This team identified two potential paths forward: one that was straightforward, and one that was more creative and would adapt FOLIO’s four-level locations in a new way. In our final meeting, where we had hoped to decide between the two options, our subject experts grappled with the challenges, risks, and rewards of the two choices and were able to recommend a path forward together. Ultimately, the team agreed that the creative option was the best choice, but that either option would work, and that guidance helped us decide how to make a first pass at configuring locations and move the project forward.
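For readers unfamiliar with the four-level locations mentioned above, FOLIO models each location as a nested hierarchy of institution, campus, library, and shelving location. The values in this small Ruby sketch are hypothetical, purely to illustrate how a single location might map onto those levels; they are not our actual configuration.

# FOLIO's four location levels, with hypothetical example values.
location = {
  institution: "Duke University",
  campus:      "West Campus",
  library:     "Perkins Library",
  location:    "Perkins Stacks"   # the shelving location a patron ultimately sees
}
puts location.values.join(" > ")
# => Duke University > West Campus > Perkins Library > Perkins Stacks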

The most important part of these meetings was valuing the expertise of our library staff and working to support them as they decided what would work best for the library’s needs. I am deeply appreciative of the staff who committed the time to these discussions while also figuring out how to move their regular jobs to remote work. Our FOLIO implementation is all the better because of their collaborative spirit.

The New Books & Media Catalog Turns One

It’s been just over a year since we launched our new catalog in January of 2019. Since then we’ve made improvements to features, performance, and reliability; developed a long-term governance and development strategy; and made plans for future features and enhancements.

During the Spring 2019 semester we experienced a number of outages of the Solr index that powers the new catalog. The root cause of the outages proved both frustrating and difficult to track down. We took a number of measures to reduce the risk of bot traffic slowing down or crashing the index, including limiting facet paging to 50 pages and results paging to 250 pages, as well as setting limits on OpenSearch queries. We added service monitoring so we are automatically alerted when things go awry, along with automatic restarts under some known bad system conditions. We also identified a bug in the version of Solr we were running that could cause crashes for queries with particular characteristics, and we have since applied a patch to Solr to address it. Happily, the index has not crashed since we implemented these protective measures and bug fixes.
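As a rough illustration of what paging limits like these can look like, here is a sketch of a guard in a Rails controller. The class name, constant names, and behavior are illustrative assumptions, not our actual implementation; only the limit values mirror the numbers mentioned above.

# Hypothetical sketch of deep-paging limits in a Rails catalog controller.
class CatalogController < ApplicationController
  MAX_RESULT_PAGES = 250
  MAX_FACET_PAGES  = 50

  before_action :reject_deep_paging, only: [:index, :facet]

  private

  # Return 400 Bad Request for page numbers beyond the configured limits,
  # so runaway bot traffic never reaches the Solr index.
  def reject_deep_paging
    limit = action_name == "facet" ? MAX_FACET_PAGES : MAX_RESULT_PAGES
    head :bad_request if params[:page].to_i > limit
  end
end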

Over the past year we’ve made a number of other improvements to the catalog including:

  • Caching of the home page and advanced search page has reduced page load times by 75%.
  • Subject searches are now more precise and do not include stemmed terms.
  • CDs and DVDs can be searched by accession number.
  • When digitized copies of Duke material are available at the Internet Archive, links to the digital copy are automatically generated.
  • Records can be saved to a bookmarks list and shared with others via a stable URL.
  • Eligible records now have a “Request digitization” link.
  • Many other small improvements and bug fixes.

We sometimes get requests for features that the catalog already supports.

While development has slowed, the core TRLN team meets monthly to discuss and prioritize new features and fixes, and dedicates time each month to maintenance and new development. We have a number of small improvements and bug fixes in the works. One new feature we’re working on is adding a citation generator that will provide copyable citations in multiple formats (APA, MLA, Chicago, Harvard, and Turabian) for records with OCLC numbers.

We welcome, read, and respond to all feedback and feature requests that come to us through the catalog’s feedback form. Let us know what you think.

Check out “Search Tips” and “Expert Search Tips” for detailed information about how to get the most out of the new catalog.

Duke Digital Repository Evolution and a new home page

After nearly a year of work, the Libraries recently launched an updated version of the software stack that powers parts of the Duke Digital Repository. This work primarily centered around migrating the underlying software in our Samvera implementation — which we use to power the DDR — from ActiveFedora to Valkyrie. Moving to Valkyrie gives us the benefits of improved stability along with the flexibility to use different storage solutions, which in turn provides us with options and some degree of future-proofing. Considerable effort was also spent on updating the public and administrative interfaces to use more recent versions of Blacklight and supporting software.
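For anyone curious about what the Valkyrie pattern looks like in practice, here is a generic sketch of defining a resource and persisting it through a configured metadata adapter. This is not code from the DDR, and it assumes an adapter has already been configured; it is only meant to show the general shape of the API that makes swapping storage backends possible.

# Generic Valkyrie sketch (not DDR code): a resource is a plain data object,
# and persistence goes through whichever metadata adapter is configured.
require "valkyrie"

class Item < Valkyrie::Resource
  attribute :title, Valkyrie::Types::Set
end

adapter   = Valkyrie.config.metadata_adapter
persister = adapter.persister

saved = persister.save(resource: Item.new(title: ["A sample item"]))
found = adapter.query_service.find_by(id: saved.id)
puts found.title.inspect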

Administrative interface for the DDR

We also used this opportunity to revise the repository landing page at repository.duke.edu and I was involved in building a new version of the home page. Our main goals were to make use of a header implementation that mirrored our design work in other recent library projects and that integrated our ‘unified’ navigation, while also maintaining the functionality required by the Samvera software.

DDR home page before the redesign

We also spent a lot of time thinking about how best to illustrate the components of the Duke Digital Repository while trying to keep the content simple and streamlined. In the end we went with a design that emphasizes the two branches of the repository: Library Collections and Duke Scholarship. Each branch in turn links to two destinations — Digitized Collections / Acquired Materials and the Research Data Repository / DukeSpace. The overall design is more compact than before and hopefully an improvement aesthetically as well.

Redesigned DDR home page

We also incorporated a feedback form that is persistent across the interface so that users can more readily report any difficulties they encounter while using the platform. And finally, we updated the content in the footer to help direct users to what they are most likely looking for.

Future plans include incorporating our header and footer content more consistently across the repository platforms along with bringing a more unified look and feel to interface components.

Check out the new design and let us know what you think!

All About that Time Base

The video digitization system in Duke Libraries’ Digital Production Center utilizes many different pieces of equipment: power distributors, waveform and vectorscope monitors, analog & digital routers, audio splitters & decibel meters, proc-amps, analog (BNC, XLR and RCA) to digital (SDI) converters, CRT & LCD video monitors, and of course an array of analog video playback decks of varying flavors (U-matic-NTSC, U-matic-PAL, Betacam SP, DigiBeta, VHS-NTSC and VHS-PAL/SECAM). We also transfer content directly from born-digital DV and MiniDV tapes.

A grandfather clock is a time base.

One additional component that is crucial to videotape digitization is the Time Base Corrector (TBC). Each of our analog video playback decks must have either an internal or external TBC in order to generate an image of acceptable quality. At the recent Association of Moving Image Archivists conference in Baltimore, George Blood (of George Blood Audio/Video/Film/Data) gave a great presentation on exactly what a Time Base Corrector is, appropriately entitled “WTF is a TBC?” Thanks to George for letting me relay some of his presentation points here.

A time base is a consistent reference point that one can utilize to stay in sync. For example, the Earth rotating on its axis is a time base that the entire human race relies on to stay on schedule. A grandfather clock is also a time base. And so is a metronome, which a musical ensemble might use to all stay “in time.”

Frequency is defined as the number of occurrences of a repeating event per unit of time. So, the frequency of the Earth’s rotation is once per 24 hours. The frequency of a grandfather clock is one pendulum swing per second. The clock example can also be defined as one “cycle per second” or one hertz (Hz), named after Heinrich Hertz, who first conclusively proved the existence of electromagnetic waves in the late 1800s.

One of the DPC’s external Time Base Correctors

But anything mechanical, like grandfather clocks and videotape decks, can be inconsistent. The age and condition of gears and rods and springs, as well as temperature and humidity, can significantly affect a grandfather clock’s ability to display the time correctly.

Videotape decks are similar, full of numerous mechanical and electrical parts that produce infinite variables in performance, affecting the deck’s ability to play the videotape’s frames-per-second (frequency) in correct time.

NTSC video is supposed to play at 29.97 frames-per-second, but due to mechanical and electro-magnetic variables, some frames may be delayed, or some may come too fast. One second of video might not have enough frames, another second may have too many. Even the videotape itself can stretch, expand and contract during playback, throwing off the timing, and making the image wobbly, jittery, too bright or dark, too blue, red or green.

A Time Base Corrector does something awesome. As the videotape plays, the TBC stores the unstable video content briefly, fixes the timing errors, and then outputs the corrected analog video signal to the DPC’s analog-to-digital converters. Some of our videotape decks have internal TBCs, which look like a computer circuit board (shown below). Others need an external TBC, which is a smaller box that attaches to the output cables coming from the videotape deck (shown above, right). Either way, the TBC can delay or advance the video frames to lock them into correct time, which fixes all the errors.

An internal Time Base Corrector card from a Sony U-matic BVU-950 deck

An internal TBC is actually able to “talk” to the videotape deck, and give it instructions, like this…

“Could you slow down a little? You’re starting to catch up with me.”

“Hey, the frames are arriving at a strange time. Please adjust the timing between the capstan and the head drum.”

“There’s a wobble in the rate the frames are arriving. Can you counter-wobble the capstan speed to smooth that out?”

“Looks like this tape was recorded with bad heads. Please increase gain on the horizontal sync pulse so I can get a clearer lock.”

Without the mighty TBC, video digitization would not be possible, because all those errors would be permanently embedded in the digitized file. Thanks to the TBC, we can capture a nice, clean, stable image to share with generations to come, long after the magnetic videotape and playback decks have reached the end of their shelf life.
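For readers who think in code, here is a toy sketch of the core idea: buffer frames that arrive at irregular times, then clock them back out at a steady 29.97 frames per second. It is purely conceptual, with made-up frame labels and timings, and bears no resemblance to how a hardware TBC is actually built.

# Conceptual illustration only: take frames with jittery arrival times and
# re-emit them on a perfectly regular NTSC clock.
FRAME_INTERVAL = 1.0 / 29.97  # seconds between NTSC frames

def re_clock(frames)
  output_time = 0.0
  frames.map do |label, _jittery_arrival_time|
    corrected = [label, output_time.round(4)]
    output_time += FRAME_INTERVAL
    corrected
  end
end

jittery = [["frame1", 0.000], ["frame2", 0.030], ["frame3", 0.071], ["frame4", 0.099]]
p re_clock(jittery)  # output timestamps are evenly spaced; the jitter is gone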

FOLIO Update November 2019

Here at Duke, the buzz continues around FOLIO. We have continued to contribute to the international project as active participants on the FOLIO Product Council and in special interest groups, and by contributing development resources. You can find links to the various groups on the FOLIO wiki.

We’ve also committed to implementing the electronic resource management (ERM)-focused apps in summer of 2020. Starting with the ERM-focused apps will give us the opportunity to use FOLIO in a production environment, and will benefit our Continuing Resource Acquisitions Department, since they do not currently use software dedicated to electronic resources to keep track of licenses and terms.

 

Our local project planning has come more into focus as well. We have gathered names for team participants and will be kicking off our project teams in January. As we’ve talked about the implementation here, we’ve realized that we have a number of tasks that will need to be addressed regardless of subject matter. For example, we’re going to need to map data – not just bibliographic, holdings, and item data, but users, orders, invoices, etc. We’ll also need to set up configurations and user permissions for each of the apps, and develop, document, and train staff on new workflows. Since this work is not siloed within functional areas, we need to facilitate discussions among them. To do that, we’re going to create a set of functional area implementation teams, plus work groups around the task areas that need to be addressed.

To learn more about the FOLIO project at Duke, fly on over to our WordPress site and read through our past newsletters, look through slides from past presentations, and check out some fun links to bee facts.

A Statement of Commitment

The featured image is from a mockup of a new repositories home page that we’re working on in the Libraries, planned for rollout in January of 2020.

Working at the Libraries, it can be dizzying to think about all of our commitments.

There’s what we owe our patrons, a body of so many distinct and overlapping communities, all seeking to learn and discover, that we could split the library along an infinite number of lines to meet them where they work and think.

There’s what we owe the future, in our efforts to preserve and share the artifacts of knowledge that we acquire on the market, that scholars create on our own campus, or that seem to form from history and find us somehow.

There’s what we owe the field, and the network of peer libraries that serve their own communities, each of them linked in a web of scholarship with our own. Within our professional network, we seek to support and complement one another, to compete sometimes in ways that move our field forward, and to share what we learn from our experiences.

The needs of information technology underlie nearly all of these activities, and to meet those needs, we have an IT staff that’s modest in size, but prodigious in its skill and its dedication to the mission of the Libraries. Within that group, the responsibility for creating new software, and maintaining what we have, falls to a small team of developers and devops engineers. We depend on them to enhance and support a wide range of platforms, including our web services, our discovery platforms, and our digital repositories.

This fall, we did some reflection on how we want to approach support for our repository platforms. The result of that reflection was a Statement of Commitment to Repositories Support and Development, a document of roughly a page that expresses what we consider to be our values in this area, and the context of priorities in which we do that work.

The committee that created the statement was our Digital Preservation and Publishing Program, or DP3 as we call it in house. We summarized our values as “openness, community and peer engagement, and independence from vended platforms,” which have “guided us to build our repositories on open source software platforms.” We place that work within the context of very large, looming priorities like our transition to FOLIO as our Library Services Platform, and the project to renovate Lilly Library. There are others, not mentioned in the statement, that fill the pages of this blog.

The statement is explicit that we will not seek to find alternative platforms for our repository services in the next several years, and in particular while the FOLIO transition is underway. This decision is informed by our recognition that migration of content and services across platforms is complex and expensive. It’s also a recognition that we have invested a lot into these existing platforms, and we want to carve out as much space as we can for our talented staff to focus on maintaining and improving them, rather than locking ourselves into all-consuming cycles of content migration.

From a practical perspective, and speaking as the manager who oversees software development in the Libraries, I see this statement as part of an overall strategy to bring focus to our work. It’s a small but important symbolic measure that recognizes the drag we create for our software team when we give in to our urge to prioritize everything.

The phrase “context switching” is one that we have borrowed from the parlance of operating systems to describe the effects on a developer of working on multiple projects at once. There are real costs to moving between development environments, code bases, and architectures on the same day, in the same week, during the same sprint, or within even an extended work cycle. We also call this problem “multi-tasking,” and the penalty it imposes on performance is well documented.

Even more than performance, I think of it as a quality of life concern. People are generally happier and more invested when they’re able to do quality work. As a manager, I can work with scheduling and planning to try to mitigate those effects of multitasking on our team. But the responsibility really lies with the organization. We have our commitments, and they are vast in size and scope. We owe it to ourselves to do some introspection now and again, and ask what we can realistically do with what we have, or more accurately, who we are.

Lighting and the PhaseOne: It’s More Than Point and Shoot

Last week, I went to go see the movie IT: Chapter 2. One thing I really appreciated about the movie was how it used a scene’s lighting to full effect. Some scenes are brightly lit to signify the friendship among the main characters. Conversely, there are dark scenes that signify the evil Pennywise the Clown. For the movie crew, no doubt it took a lot of time and manpower to light an individual scene – especially when the movie is nearly 3 hours long.

We do the same type of light setup and management inside the Digital Production Center (DPC) when we take photos of objects like books, letters, or manuscripts. Today, I will talk specifically about how we light the bound material that comes our way, like books or booklets. This type of material is generally shot on our PhaseOne camera, so I will focus on that lighting setup today.

Before We Begin

It’s not enough to just turn on the lights in our camera room. In order to properly light all the things that need to be shot on the PhaseOne, we have specific tools and products we use, which you can see in the photo below.

We have 4 high-powered lights (two sets of two Buhl SoftCube SC-150 models) pointed directly in the camera’s field of view. There are two on the right and two on the left. These are stationed approximately 3.5 feet off the ground and approximately 2.5 feet away from the objects themselves. These lights are supported by Avenger A630B light stands. They allow for a wide range of movement, extension, and support if we need them.

But if bright, hot lights were pointed directly at sensitive documents for hours, they would damage them, so light diffusers are necessary. For both sets of lights, we have 3 layers of material to diffuse the light and prevent material from warping or text from fading. The first layer, directly attached to the light box itself, is an inexpensive sheet of diffusion fabric. This type of fabric is typically made from nylon or silk.

The second diffusion layer is an FJ Westcott Scrim Jim, a similar thin fabric that is attached to a lightweight stand-up frame, the Manfrotto 156BLB. This frame can also be moved or extended if need be. The last layer is another sheet of diffusion fabric, attached to a makeshift “cube” held up by lightweight wooden rods. This cube can be picked up or carried, making it very convenient if we need to eventually move our lights.

So in total, we have 4 lights, 4 layers of diffusion fabric attached to the light boxes, two Scrim Jims, and the cube featuring 2 sides of additional diffusion fabric. After having all these items stationed, surely we can start taking pictures, right? Not yet.

Around the Room

There are still more things to be aware of – this time in the camera room itself. We gently place the materials on a cradle lined with black felt, similar to velvet. This cradle is visible in the bottom right part of the photo above. It is placed on top of a table, also coated in black felt. This is done so no background colors bounce back or reflect onto the object and change how it looks in the final image. The walls of the camera room are also painted a neutral grey color for the same reason, as you can see in the background of the above photo. Finally, any tiny reflective segments between the ceiling tiles have been blacked out with gaffer tape. Having the room this muted and intentionally dark also helps us when we have to shoot multi-spectral images. No expense has been spared to make sure our colors and photos are correct.

Camera Settings

With all these precautions in place, can we finally take photos of our materials? Almost. Before we can start photographing, we have to run some tests to make sure everything looks correct to our computers. After making sure our objects are sharp and in focus, we use a program called DTDCH (see the photo to the right) to adjust the aperture and exposure of the PhaseOne so that nothing appears either way too dim or too bright. In our camera room, we use a PhaseOne IQ180 with a Schneider Kreuznach Apo-Digitar lens (visible in the top-right corner of the photo above). We also use the program CaptureOne to capture, save, and export our photos.

Once the shot is in focus and appropriately bright, we will check our colors against an X-Rite ColorChecker Classic card (see the photo on the left) to verify that our camera has a correct white balance. When we take a photo of the ColorChecker, CaptureOne displays a series of numbers, known as RGB values, found in the photo’s colors. We will check these numbers against what they should be, so we know that our photo looks accurate. If these numbers match up, we can continue. You could check our work by saving the photo on the left and opening it in a program like Adobe Photoshop.
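To make “checking the numbers” a little more concrete, here is a tiny sketch of the kind of comparison involved. The reference values and tolerance are invented for illustration; they are not the published ColorChecker targets or the values we actually use.

# Illustrative only: is a sampled patch close enough to its reference RGB value?
def within_tolerance?(measured, reference, tolerance = 5)
  measured.zip(reference).all? { |m, r| (m - r).abs <= tolerance }
end

measured_white  = [242, 240, 238]  # hypothetical values read off the capture
reference_white = [243, 243, 242]  # hypothetical target for the white patch
puts within_tolerance?(measured_white, reference_white)  # => true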

Finally, we have specific color profiles that the DPC uses to ensure that all our colors appear accurate as well. For more information on how we consistently calibrate the color in our images, please check out this previous blog post.

After all this setup, now we can finally shoot photos! Lighting our materials for the PhaseOne is a lot of hard work and preparation. But it is well worth it to fulfill our mission of digitizing images for preservation.