Better than joining the CHORUS - Scholarly Communications @ Duke

Last week we saw two proposals about how the various federal agencies that fund research might implement the recent directive from the White House Office of Science and Technology Policy that mandates public access to the products of funded research. A group of publishers unveiled (sort of) a proposal they call CHORUS, while the Association of American Universities, the Association of Research Libraries and the Association of Public and Land-grant Universities collaborated on a different proposal, referred to as SHARE.

The publishers' proposal (the acronym stands for Clearing House for the Open Research of the United States) is described in glowing terms on the Scholarly Kitchen website and with a bit more restraint by the Chronicle of Higher Education. The proposal from the education associations, dubbed Shared Access Research Ecosystem, is also described by the Chronicle and is the subject of a detailed draft proposal that can be found here.

For myself, I would rather SHARE than join the CHORUS, for a number of reasons.

First, I think CHORUS is being touted, at least in what I have read, by comparing it to a straw man. Its principal virtue seems to be that it would not cost the government as much as setting up lots of government-run repositories, clones of PubMed Central. But it is not clear that that option is being seriously suggested by anyone. Certainly many of us encouraged the agencies to look at the benefits of PMC for inspiration and not sacrifice those benefits in their own plans, but that does not mean that each agency must “reinvent the wheel,” no matter how successful that wheel has been. So the principal virtue of CHORUS seems to be that it does not do what no one is suggesting be done.

The most important thing to understand about CHORUS is that it is a dark archive. The research papers in CHORUS would not be directly accessible to anyone; they would be “illuminated” only if a “trigger event” occurred. Routine access would, instead, be provided on the proprietary platforms of each publisher, while the CHORUS site would simply collect metadata about the openly-accessible articles and point researchers to the specific publisher platforms.

It seems to me that the CHORUS proposal is “disabled” from the start, by which I mean that it lacks three fundamental abilities. CHORUS, at least based on the descriptions we have seen, lacks findability, usability and interoperability.

Perhaps the most troubling remark in the description offered on the Scholarly Kitchen blog is that “Users can search and discover papers directly from CHORUS.gov or via any integrated agency site.” Does this mean that even the collected metadata would not be available to Google? We know how few researchers “walk through the front door” of our research tools, so limiting discovery to the CHORUS portal or “integrated agency sites” would make these open access papers virtually invisible (which, one suspects, is the point). As things stand now, open access papers which reside on proprietary publisher platforms are difficult to find because there is no consistency in how they can be discovered. That is the principal reason so many COPE fund institutions will not support so-called “hybrid” open access publishing that makes a few articles open on an otherwise toll-access site. It does not seem that CHORUS would change that unfortunate situation at all, which is probably why Heather Joseph of SPARC calls CHORUS “a restatement of the status quo.” The public would gain very little, since the major goal of the proposal is for the publishers to cling tightly to control over the research papers that they have been entrusted with.

Another ability that CHORUS would lack is usability, since as far as we know, all that a researcher or other user could do with these papers is read them. It would not, of course, facilitate sharing, teaching or reuse, even though these abilities are vital to improving the speed and quality of research in the United States. And then there is interoperability. If the geographically disparate archives are genuinely federated, searches across all of them, even keyword searches that are not dependent on the metadata created for each article, would be possible. So would text and data mining across a large corpus of works. We already know that such interoperability creates tremendous new opportunities for expanded research, collaboration, and previously impossible discoveries. But there is no reason to believe that CHORUS would support interoperability, since the various publishers have a strong competitive interest in not allowing cross-platform activities. Research and education, however, not only do not benefit from that competition, but are actively “disabled” by it.

On the other hand, the proposal from the universities and their libraries is for a genuinely federated system of university-based repositories. Those repositories already exist, so if we are going to make a cost argument, it really favors SHARE. And these repositories, unlike the publisher platforms, have a strong interest in facilitating discovery. Also, the detailed proposal offered by these groups addresses text and data mining, semantic data, APIs for research and linked data. All of these things make university-based research better, while they pose threats to the commercial publishers who have designed CHORUS to protect themselves, not to benefit research or the public. So all the incentives line up between the public interest and the university-based SHARE system.

If the OSTP and the research-funding agencies take seriously all of the opportunities that were described in the comments they have solicited over the past year, it will be very obvious to them that CHORUS is singing flat, while it would be good to SHARE, just as our parents always told us.