Library values and the growing scholarly digital divide: In memoriam Aaron Swartz

Why did you decide to become a librarian or work in libraries? For me, as for many of us, working in a library wasn’t just an arbitrary job to pay the bills; we have a special affinity for the mission and values of libraries. That mission and those values focus on connecting people to the research and information they need to make informed decisions and take informed action, through a democratic and egalitarian approach that serves all in need rather than maximizing the profit that can be extracted from our customers.

In fact, libraries are just about the only ‘information institutions’ whose business interests are centered on aiding our users, not on commodifying them as demographic data, ‘eyeballs’, or paying customers.

Even the university library has historically been a center of knowledge distribution to the public at large, not just the university community. This was especially, but not only, true at public universities, which saw dissemination of knowledge to the citizenry as part of their mission.

Consider the US resident of 30 years ago who wanted access to a scholarly article. She could walk into a university library (at almost all public and many private universities), take a bound journal off the shelf, browse, locate, and read articles of interest, and even (in the post-photocopy world) photocopy an article for personal research uses. This usage pattern was in fact the same one that those affiliated with that university would engage in: the university library served as the hub of scholarly knowledge for the affiliated and non-affiliated alike, at least in the US and the first world.

The digital revolution has changed this. The access to scholarly knowledge we provide is through licensed electronic copies, mostly available only to our affiliates. Those affiliated with paying institutional customers can now access scholarly articles from the comfort of their own homes — but those not affiliated are basically out of luck. Even if the university library has a printed copy of the article in which a non-affiliate might be interested (increasingly unlikely); even if the library provides public entry and public workstations at which the non-affiliate can view licensed electronic articles (if they can wait on line to get on one of the limited public workstations) — their second-class status and access level would be apparent to them: One method of convenient access for affiliates, another method of very inconvenient and frequently impossible access for everyone else.

This change isn’t mainly due to decisions made by libraries; in fact, the economic and cultural changes of the digital revolution are making things very difficult for libraries too. In one salient example, libraries largely can’t purchase ebooks for lending even to their own affiliates.

But the fact is that the digital revolution, which would seem to provide the technology to make access to the world’s information more widespread, efficient, and affordable than ever, is instead widening the access gap between the information haves and have-nots. University libraries, which used to serve as the gateway for public access to scholarly output, now serve in many people’s minds as symbols of their host universities as gated communities, with high walls keeping out the information have-nots. (And don’t get me wrong, even our own affiliated patrons aren’t exactly happy with ease-of-access issues either.)

What are libraries and librarians doing about this? What are we saying about it? What could or should we be doing and saying?

* . * . * . * . *

Aaron Swartz was a ‘child prodigy’ of the development of the internet as we know it. Among other things, he co-authored the RSS 1.0 standard while still a teenager.

And Swartz shared the presumed values of libraries — the widespread and democratic dissemination of human knowledge, the minimization of inequality in information access.

He was involved in the development of Creative Commons, and in the Internet Archive’s Open Library project. (He wrote: “Our goal is to build the world’s greatest library, then put it up on the Internet free for all to use and edit.”)

And in 2009, he targeted PACER, a government-run system which charged significant fees for access to court records. Swartz bulk downloaded these public documents (which are not under copyright), and contributed them to a non-fee-charging public repository, Carl Malamud’s public.resource.org. While PACER still has a fee-based structure for downloads, fees are waived for a basic level of use, and there’s a Firefox browser extension that lets users forward their free downloads to an Internet Archive collection.

The FBI investigated Swartz for his PACER bulk downloads, but never pressed charges (perhaps because no laws were broken). Swartz’s actions helped bring attention to the restrictions on public access to public documents, and helped contribute a bit to widening access, both through the publicity they brought to the issue and through his direct action of downloading and redistributing the documents.

More recently, Swartz was apparently offended by the growing gap in access to scholarly output and, perhaps thinking by analogy with his work on PACER, set up a system to bulk download as much of the well-regarded non-profit JSTOR aggregator’s content as he could get. Using methods that sound like something out of a techno-thriller, he allegedly set up a rogue server in the basement of an MIT building, which, by virtue of being in MIT’s IP address range, had access to JSTOR, and proceeded to scrape as many documents as he could.

Of course, the legal and practical situation of this JSTOR endeavor was quite different from that of PACER.

He was caught. He was arrested in January 2011.

JSTOR said they did not want to press charges [2]; MIT, however, has made no such statement (we can’t know for sure what either of these organizations was saying to the prosecutor behind the scenes).

The government pressed charges anyway. Multiple felony charges adding up to the possibility of 50+ years in prison and $4 million in fines.

For copying documents that any academic institution (including Harvard, where Swartz held a fellowship! [3]) already licenses ‘all you can eat’ access to from JSTOR.

I heard about this situation and was outraged, but then forgot about it. I figured eventually there’d be a big campaign in defense of Swartz, and I’d donate some money and sign some petitions and reblog the calls for support when it happened. Such an organized campaign never materialized; I’m not sure why. The well-known lawyer Lawrence Lessig, a friend of Swartz’s, suggests that Swartz was “unable to appeal openly to us for the financial help he needed to fund his defense, at least without risking the ire of a district court judge.”

On January 11, 2013, Swartz killed himself.

Swartz’s family writes:

Aaron’s death is not simply a personal tragedy. It is the product of a criminal justice system rife with intimidation and prosecutorial overreach. Decisions made by officials in the Massachusetts U.S. Attorney’s office and at MIT contributed to his death. The US Attorney’s office pursued an exceptionally harsh array of charges, carrying potentially over 30 years in prison, to punish an alleged crime that had no victims. Meanwhile, unlike JSTOR, MIT refused to stand up for Aaron and its own community’s most cherished principles.

Attorney Lessig writes, in his blog post entitled Prosecutor as Bully:

…the question this government needs to answer is why it was so necessary that Aaron Swartz be labeled a “felon.” For in the 18 months of negotiations, that was what he was not willing to accept, and so that was the reason he was facing a million dollar trial in April… And so as wrong and misguided and fucking sad as this is, I get how the prospect of this fight, defenseless, made it make sense to this brilliant but troubled boy to end it.

* . * . * . * . *

Swartz’s alleged actions may or may not have violated criminal law[1]. The ethics of his actions are very debatable: legal or not, intentional direct action in principled violation of the law can still be ethical, but there are certainly arguments that in this case his actions were not. In any event his actions, certainly in retrospect, seem not to have been at all strategic or wise.

But his actions were not the kind of “Ocean’s 11” larceny or terroristic attack that the government tried to paint them as. There’s very little room for dispute there.

Lessig writes:

From the beginning, the government worked as hard as it could to characterize what Aaron did in the most extreme and absurd way. The “property” Aaron had “stolen,” we were told, was worth “millions of dollars” — with the hint, and then the suggestion, that his aim must have been to profit from his crime. But anyone who says that there is money to be made in a stash of ACADEMIC ARTICLES is either an idiot or a liar. It was clear what this was not, yet our government continued to push as if it had caught the 9/11 terrorists red-handed.

Librarians and libraries know how the market for scholarly publications works, and know well that suggesting that Swartz “stole property” worth “millions” is ridiculous.

  • There’s little market for selling such a document collection without legal authorization: the institutional customers willing to pay the kinds of prices one pays for a JSTOR license (largely set by publishers, not JSTOR, it is true) aren’t going to stop paying JSTOR (and other aggregators and publishers) in favor of pirated content.
  • And the third world markets which can’t afford to license content from JSTOR and other aggregators? The dirty secret libraries know is that this content is already being pirated, successfully but quietly, on a regular basis. Every university library, and most content hosting platforms, can find evidence of such unauthorized acquisition, in dribs and drabs as well as in bulk, if they care to look for it. (See, for example, Heather Tones White, “Electronic Resources Security: A look at Unauthorized Users” in the Code4Lib Journal.) We know that people regularly take advantage of our networks to ‘pirate’ scholarly articles for overseas markets, and the nature of our infrastructures and capabilities gives us little means of preventing this.
  • And we know that JSTOR licensees (nearly every university) typically have unlimited and non-metered access to licensed JSTOR collections, which puts the portrayal of bulk downloading as a massive million-dollar ‘theft’ in the farcical context it deserves, even if this kind of bulk downloading was prohibited by JSTOR’s terms of service.

Librarians and libraries have professional knowledge that portraying Swartz’s activity as a million-dollar-plus profit-motivated larceny, and prosecuting it as such, is ridiculous. And librarians and libraries know that the inequity in access to scholarly content that offended Swartz is a real problem. However misguided his approach to addressing the issue, Swartz was on our side; or at least, we should have been on Swartz’s side, writing to the prosecutor and court with our professional expertise that this was not the sort of crime it was being portrayed as.

Libraries and librarians should have stepped up to defend Swartz publicly. But largely they didn’t. Both Lessig and Swartz’s family hold MIT accountable for declining, unlike JSTOR, to state publicly that it did not want charges filed against Swartz.

Were the librarians of the MIT library arguing in defense of Swartz behind the scenes? I have no way of knowing.

But I know that few other libraries or librarians were standing with Swartz. We all should have been, we largely were not, and it’s a shame. [3.5]

* . * . * . * . *

Lessig writes:

For remember, we live in a world where the architects of the financial crisis regularly dine at the White House — and where even those brought to “justice” never even have to admit any wrongdoing, let alone be labeled “felons”.

We live in a society, and under a system of laws, that prioritizes above all other concerns or values protecting the ability of private businesses to make a dollar off the public: in this case, the ability of publishers to profit off of ideas and words they claim as their property, in a world where there’s virtually no commons left and everything is someone’s property. It is a system that prioritizes private profit above any public value in equitable access to information, and that prioritizes an apparent desire to teach some kind of perverse lesson about the value of privatized profit above justice, above proportionality, and above individuals’ lives.

The priorities and values of those in power are broken, and not just when it comes to intellectual property, but the area of intellectual property is where libraries operate. And it’s in this area that libraries, both public and academic, have a history of speaking out for, and acting to create, equitably distributed access to research and information.

I am honestly not sure libraries are going to exist anymore in a couple decades. The information ocean in which libraries swim has been changing drastically. I think we had a limited amount of time to learn how to swim in this new ocean, and I think our time may have run out without our rising to the (very real and difficult) challenge: we have not succeeded in making our place in the digital environment, and it may already be too late for us to catch up before we are reduced and eliminated by our hosting and funding organizations.

It’s not that libraries aren’t needed anymore; they still are. In fact, we may be needed more than ever. In a society and economy where information is more important than ever, libraries, public and academic both, are, I will say again, the only institutions specializing in information whose interests, business plans, values, and missions lie in expanding access to information without a profit motive of our own, and who can act institutionally with interests fully aligned with those of our users and the public at large, if we remember our historical values and accept our responsibility to do so.

Libraries are the institutions with the ability and responsibility to sound the alarm that the digital divide in access to scholarly output is growing, not shrinking.

But are we doing so? We can indeed find university librarians and libraries talking about how increasing scholarly publishing prices are imperiling our ability to provide access to our own users, who — as scholars — do after all generate the ‘content’ in the first place. But how often do we talk about those left outside our walled gardens entirely, how access to human knowledge has been sequestered behind paywalls, financially inaccessible to the “law-abiding” public at large, especially in the developing world?

As libraries, do we have a unique role, responsibility, and power here? If so, what would it look like for libraries to take a stand?

There aren’t obvious answers: What power do we have, when we’re economic hostages to the publishers too, and are struggling to be perceived as relevant just to our direct constituencies? And, anyhow, what chance is there that our administrators will even share these values or find them a priority, in an increasingly privatized, de-funded, neo-liberal environment where libraries are increasingly expected to ‘entrepreneurially’ turn a dime off their users somehow too?[4]

I don’t know, but I know it starts with speaking up. We ought to be willing to take at least a fraction of the risks (organizational, professional, and personal) that Swartz did in acting for equitable access to scholarly output, if hopefully in more strategic and successful ways. And if we’re not, we ought to at least be speaking out in defense of those who are, as Swartz was.[5]

It might not save libraries, but it’ll help keep libraries worth saving.

Can’t we at least go out fighting?


[1] Lawrence Lessig, who it should be said was both a friend of Swartz’s and a knowledgeable attorney, wrote:

Even if the facts the government alleges are true, I am not sure they constitute a crime. There is considerable uncertainty in this area of the law. Many wonder about the quick conversion of terms-of-service into criminal prosecution. But that’s a question the courts will ultimately have to resolve.

http://mediafreedom.org/2011/07/larry-lessig-responds-says-swartzs-alleged-actions-crossed-ethical-line/

[2] While JSTOR did not want charges pressed, they still published a statement (taken off the net sometime between two days ago, when I first found it, and today; thanks, Internet Archive, for preserving a copy) that misleadingly portrayed Swartz’s actions as a kind of theft, which they simply were not. For example, it confusingly claimed that JSTOR considered the situation resolved because they had “secured from Mr. Swartz the content that was taken”. That is a nonsensical claim when talking about digital documents, of which unlimited copies can be made at no expense, but it helps build a narrative that treats the actions as if they were a theft of physical property, which is unavailable to its rightful owner until returned.

[3] Why did Swartz use MIT’s network instead of Harvard’s, where he held a fellowship? Perhaps because MIT’s network lacks the basic security protections almost any other university network would have against the kind of approach he used. See http://unhandled.com/2013/01/12/the-truth-about-aaron-swartzs-crime/ . Perhaps because of MIT’s ‘hacker’ culture, in which such clever security intrusions were often considered entertaining pranks.

[3.5] There are certainly some exceptions to the lack of attention to Swartz’s case, such as Nancy Sims’s excellent piece in College and Research Libraries News. And it should be said that JSTOR, perhaps ironically, is making a bit more effort at increasing access to scholarly content than most of their peers in scholarly publishing and aggregation, including, especially sadly and ironically, an initiative obviously already in development that was announced just this week. (Was it in development before Swartz’s 2011 bulk download, or did he help shame them into it? I do not know.) Whether or not JSTOR, a non-profit aggregator which itself must license its content from the publishers, is doing enough to increase accessibility of scholarly output (I think few if any of the institutions involved in academic publishing and dissemination are), they are hardly an example of the worst, the most greedy, or the most culpable. (Anyone in the industry could make suggestions as to who would be at the top of that list, and we’d probably come up with much the same list.)

[4] Where libraries are unique, as I’ve said several times in this essay, is in their business model of acting on behalf of our users instead of trying to make money off of them. If our funders try to turn us into just another business with a profit motive, we’ll be just like everyone else but not as good at it as they are, and we will surely sign the papers on our own dissolution.

[5] From a Chronicle of Higher Education article:

“What Aaron Swartz did was a clear violation of the rules and protocols of the library and the community,” says Christopher Capozzola, an associate professor of history and acting associate dean of the school of humanities, arts, and social sciences. “But the penalties in this case, and the sources of those penalties, are really remarkable. These penalties really go against MIT’s culture of breaking down barriers.”

And John H. Summers, a “historian and former Harvard lecturer”, quoted in the same article:

“What Aaron’s case begs us to remember is that universities are supposed to be public, not-for-profit institutions,” Mr. Summers says. “They owe a standing moral debt to the public.”

I suggest that in issues of access to research materials, university libraries’ and librarians’ professional role and responsibility is to act as the conscience of the university, reminding their hosting institutions of that moral debt and responsibility, and of our institutional academic cultures of breaking down barriers.

12 thoughts on “Library values and the growing scholarly digital divide: In memoriam Aaron Swartz”

  1. Reblogged this on Viv's Academic Blog and commented:
    Another interesting article about open access in the digital era. Includes the particularly alarming comment that “I am honestly not sure libraries are going to exist anymore in a couple decades.”

  2. Thank you. We all could and should have done more. Aaron paved a way for us to be creative and bold. We can, at the very least, honor his memory by stepping up and not being so fearful as a profession.

  3. One piece of action we could take is we could buy back science. There are at least 5 million scientists in the world, and many millions of science supporters that donate to charities, cancer research, and so on. Let’s say at least 10 million people all in all. If we can get even 50% of this population to commit to buying back science at, say, $50/mo (so $250M/mo), it would be possible to acquire large equity stakes in Elsevier and other publishers. I think the entire science community could be united to do this. There really isn’t a single person in the scientific community who thinks the current situation makes any sense.

  4. Insights from Richard M. Stallman and Eben Moglen:

    -The Right to Read : Richard M. Stallman

    Stallman paints a scary future where only authorized people may read something, and it is illegal to allow others to read it.

    -Misinterpreting Copyright—A Series of Errors : Richard M. Stallman

    “Copyright is … an artificial concession made to them [the authors] for the sake of progress”

    -Science Must ‘Push’ Copyright Aside : Richard M. Stallman

    “It should be a truism that scientific literature exists to disseminate scientific
    knowledge, and that scientific journals exist to facilitate the process. It therefore
    follows that rules for use of scientific literature should be designed to help achieve
    that goal.”

    -Are we currently witnessing the end of copyright as we know it?
    from Eben Moglen on Facebook, Google and Government Surveillance
    start at 14:25
    “We need copyright law to change, we need to get out of the environment in which someone is yelling thief, thief, thief, all the time. and we need to reassure young people who want to do journalism that they don’t need a paywall around the universe in order to grow up and own a house and have a car, and raise a family, in order to do that we’re going to have to confront some untruths on both sides, on one side … “

  5. Thank you for this. As an unaffiliated scholar, the only way I can get to articles for my research is through an informal agreement with my alma mater that would probably not stand up in court. And, Aaron should still be around, drinking coffee from a “I solemnly swear I am up to no good” mug.

  6. Scholars can already upload their own copy of an article on sources such as arXiv or ResearchGate, where it is possible to read materials that are also behind subscription pages. There are very few academic articles that cannot be found in free form from an open service. (books are more difficult)

  7. C. Wagner: The way _most_ journal publishers work, the author gives either copyright or an exclusive publishing license (limited time or perpetual) to the publisher. The author cannot in fact legally upload the final published version of their own article on the public web. Doesn’t mean many of them don’t do it anyway.

    What Swartz allegedly bulk downloaded was in fact a collection of journal articles. But yes, an ironic thing is that many of them (I’m not sure it’s true of “almost all” of them, as you suggest) could already be found on the open web (not necessarily legally).


  8. In response to C. Wagner’s comment:

    “Scholars can already upload their own copy of an article on sources such as arXiv or ResearchGate, where it is possible to read materials that are also behind subscription pages. There are very few academic articles that cannot be found in free form from an open service. (books are more difficult)”

    As a public librarian without an .edu email, I cannot join ResearchGate. ArXiv does not have the type of information I look for, which is the history of libraries in America as well as children’s literature research. Those articles are mostly only available on paid subscription services.

    Books I can get, because of extensive and wonderful ILL systems. As more and more journals are digitized and taken off the shelves of university libraries, I reiterate my previous claim: it is very difficult for the non-affiliated scholar to gain access to journal articles.
