One of Will Rogers’ best-known aphorisms is “I only know what I read in the papers.” In line with Rogers’ irony, if all one knows about the Aaron Swartz case is what one reads in the blogosphere, one knows very little indeed, and much of it wrong.
Swartz has been indicted on several federal charges after allegedly physically and technologically gaining unauthorized access to the MIT network and downloading a huge number of files from JSTOR. On that everyone agrees. After that the claims about and arguments based on this event diverge dramatically.
Predictably, many bloggers (an example is this one from the Copyright Alliance) call these actions by Swartz “theft” or “stealing.” As always when talking about intellectual property, these words are misapplied. The formal definition of theft from Black’s Law Dictionary is “the felonious taking and removing of another’s personal property with the intent of depriving the true owner of it.” It should be clear from this definition why we call unauthorized use of intellectual property “infringement” rather than theft. What Swartz is alleged to have done did not remove the intellectual property and showed no intent to deprive the original owner of it; he merely made, allegedly, unauthorized copies, which does not have the effect of depriving anyone else of intangible property. JSTOR was never without these files and has, in fact, recovered the unauthorized copies.
Whenever someone uses the language of theft in reference to intellectual property, they are trying to cover the weakness of their argument, in my opinion. Let’s just say infringement and talk about both the legitimate reasons to protect IP and the public policy that permits some unauthorized copying.
By the way, Swartz has not been charged with copyright infringement either. The charges of wire fraud, computer fraud and illegally obtaining information from a protected computer all relate to the hacking itself, not to the downloads.
Another place where serious misrepresentations abound is when we are told (as in this post on the Scholarly Kitchen) that Swartz has “done this before” because of a previous incident in which he downloaded large numbers of documents from PACER, a database used by the federal courts. That incident, however, involved neither illegal access nor copyright infringement. Although PACER usually charges a fee, Swartz used a computer at a university on which access was being provided for free as an experiment. And the materials he downloaded – documents from the federal courts – are not protected by any copyright, under section 105 of the US copyright law. To be sure, Swartz was protesting the fees charged for access to works created at taxpayer expense for the public good, but his actions in that case have no analogy to the behavior charged in this indictment.
One place where there is significant disagreement is about Swartz’s intentions. Many bloggers simply assume that he intended to release all of the downloaded files to the public, although Swartz claims he intended to do text-mining research with the articles. He has done such work before, so there is some plausibility to his claim, which may explain why infringement charges have not been brought. So turning this into a debate about the open access movement is wholly inappropriate. It is important to recognize that the victim of these alleged crimes was not JSTOR or any of the journals it aggregates. The victim was MIT.
However fervently one shares Swartz’s goals for greater access to legal and scholarly information and publications, the actions with which he has been charged do not serve those goals. Quite frankly, Swartz’s actions were not radical enough, in the sense that they did not get to the root of the problem. It is clear that the system of scholarly dissemination is badly broken, and simply hacking it does not change that fact. The real change, the real solution Swartz (apparently) seeks, will be found only when the academic authors, the original holders of copyright, stop transferring those copyrights to publishers without careful reflection and safeguards on their right to disseminate their own work widely.
I, too, am baffled by the question of Swartz’s intentions. You say that he “claims that he intended to do text-mining research”. I haven’t seen any statements from him about the circumstances of his arrest or what he was actually up to. Can you point me to the source for that?
The possibility that Swartz intended to use the articles for meta-analysis is actually raised by one of his collaborators in such work in the past. See the blog post at http://blogs.reuters.com/mediafile/2011/07/20/the-difference-between-google-and-aaron-swartz/
Thanks — I’d also assumed that as a possibility based on various things he’s done and things I’d read. So your statement that he “claims he intended to do text-mining research with the articles…” is not factually accurate? “Careless language” indeed.
I still lean toward thinking that was the likeliest intent, but then there’s the niggling question of why he didn’t just use the options for text-mining that JSTOR provides. I guess we’ll have to wait until he actually claims something.