Umlaut 3.1.0 released, with new Bootstrap-based visual design

I like to be confident that open source code I wrote is pretty stable and robust before recommending that others use it.

So I usually try to run any new code, or new versions of existing code, in production myself for a couple weeks before actually releasing it as a stable release.

I’ve been running Umlaut 3.1.0 in production for a couple weeks now. I found some minor problems by reviewing the logs for uncaught exceptions, and fixed them. It’s ready for a release.

Umlaut is an open source aggregator of “last mile” services, working with your link resolver and other services to provide consolidated and efficient discovery/delivery service provision.

Umlaut 3.1.0 has now been released. Please see the release notes, especially if upgrading from a previous version of Umlaut.

The major change is a complete overhaul of the visual design, based on Bootstrap, and small-screen friendly. Thanks again to Scot Dalton from NYU for the initiative to make the Bootstrap-based redesign finally happen.


Posted in General | Leave a comment

on the internet, and power

Bruce Schneier writes on how our internet lives are frequently dominated by a few huge internet companies with immense power over those internet doings:

There are a lot of good reasons why we’re all flocking to these cloud services and vendor-controlled platforms. The benefits are enormous, from cost to convenience to reliability to security itself. But it is inherently a feudal relationship. We cede control of our data and computing platforms to these companies and trust that they will treat us well and protect us from harm. And if we pledge complete allegiance to them — if we let them control our email and calendar and address book and photos and everything — we get even more benefits. We become their vassals; or, on a bad day, their serfs….

…So how do we survive? Increasingly, we have little alternative but to trust someone, so we need to decide who we trust – and who we don’t – and then act accordingly. This isn’t easy; our feudal lords go out of their way not to be transparent…

In the longer term, we all need to work to reduce the power imbalance…   We need to balance this relationship, and government intervention is the only way we’re going to get it.

I’ve also been thinking a lot about trying to create more cooperatively controlled internet infrastructure as a way to balance this power and bring (economic) democracy to the internet.

And, with regard to libraries, in many of our fantasies the institution of libraries, collectively, would be a force in internet life, a civic, public sector, decentralized but massive in aggregate counter-balance to the ‘feudal’ internet companies.  If libraries can find, keep, and expand a sustainable role as internet actors.


Scientific publishing has some problems beyond business models

From an open letter in the Guardian:

 Early in their training, students learn that the quest for truth needs to be balanced against the more immediate pressure to “publish or perish”….

…This publishing culture is toxic to science. Recent studies have shown how intense career pressures encourage life scientists to engage in a range of questionable practices to generate publications….

…At the same time, journals incentivise bad practice by favouring the publication of results that are considered to be positive, novel, neat and eye-catching. In many life sciences, negative results, complicated results, or attempts to replicate previous studies never make it into the scientific record. Instead they occupy a vast unpublished file drawer….

As academic librarians, our role is to be experts, not in any specific field, but in the phenomenon of academic publishing in general. In our educational role with students, we ought to be helping students understand and think about these issues, to problematize and complexify the world of research publication. Despite our patrons’ desire to have blacks and whites that let them complete their assignments with as little thinking as possible (yeah, I said it), it’s our professional duty to not only help them complete their assignments as conveniently as possible but also understand problems and current issues in academic publishing in general.

And to make the case to administrators and faculty that this is our rightful role. Not all faculty will welcome critique of the scholarly publishing enterprise that is essentially their livelihood either, of course (go read that letter in the Guardian we began with, again).

This reminds me again of Karen Coyle’s excellent points about the phenomenon of “predatory publishers”: we over-simplify if we suggest that publications can easily be split into problem-free ‘good’ ones and untrustworthy ‘predatory’ ones.

We do disservice to our patrons to imply that as long as they steer clear of identified ‘bad’ publishers from some librarian-endorsed list, then of course anything that’s “peer reviewed” becomes absolutely trustworthy gospel through the magical transubstantiation of ‘peer review’.

And we do disservice to ourselves and our professional capacities to avoid critical engagement with our domain of expertise — academic publishing.  We are — or ought to be — academic professionals, not just clerks and secretaries for the university community or salespeople for scholarly publishers.    Could it help restore professional credibility and respect to librarians if we participated at the front of research into research, of the history and critical analysis of the enterprise of scholarly publishing?


Take control of delivery and access with Umlaut

In a recently published editorial in ITAL, Services and User Context in the Era of Webscale Discovery, Mark Dehmlow writes:

A major issue that continues to confound me is the lack of fully integrated request and delivery services that many discovery systems lack. Of course, all of them implement full text linking to every online article that they can create a link to, but as the sphere of scholarly data stretches beyond just articles, library print collections and delivery services have continued to be neglected primarily because implementing those services in an intuitively integrated way, beyond the “link to your old OPAC” methodology, remains a complex task. My main concern with this deficit is that there is a significant amount of scholarly material only available in print and to focus primarily on electronic access limits the ability of our users to perform comprehensive research and reduces access to significant resources and services that libraries provide.

The open source Umlaut software (for which I am principal developer) has been aiming to fill this gap for over 7 years now, providing an aggregated and integrated path to delivery and access that cuts across library departments, systems and services, across the entire library business.

To be sure, Umlaut is not a magic bullet. It’s more a platform to design the best solution you can in your actually existing infrastructure. To make the most of Umlaut requires local developer time and creativity to figure out how you can use it to tie together your various systems and services as seamlessly as possible. And a typical lack of good integration APIs in much of our existing (proprietary) infrastructure is an added challenge, generally increasing cost/time of developing a good solution. But Umlaut is designed to be a platform supporting local solutions to integrating delivery, access, and specific item services, giving you the common skeleton on which you can hang your custom local functionality.

I agree with Dehmlow (and others I know I’ve read essays from but can’t find now) that the ‘last mile’ of access and delivery ought to be a priority for libraries — among other reasons, because access and delivery of the mountains of content we still have that is not both online and freely available, is something that we uniquely provide to our patrons, with much less ‘competition’ than for search and discovery services.  If our services aren’t good, our patrons don’t have other options (such as Google) to get (eg) printed monographs for their research (without just buying them).

And, at this stage in the development of our technological infrastructures, this is not something that a proprietary vendor-provided open-the-box-and-turn-it-on solution is going to be able to do well. Integrated access/delivery necessarily involves cross-cutting multiple pieces of local enterprise software (catalog, ILL, local identity/SSO, and that’s just the start) and policies (can you request locally held books to be delivered to your office? Does it depend on who you are and where the book is?). It requires custom local policy and integration logic. It’s not going to be feasible/economical for a vendor to provide one-size-fits-all software that actually works well in this arena.  So I’m not as

At least, until your entire library enterprise infrastructure comes from one vendor and consists of an actually integrated single-business cloud platform. This does seem to be where the industry is heading and what, for instance, OCLC, Ex Libris, and Serials Solutions are trying to provide. I know some of these vendors are trying to provide integrated ‘last mile’ services taking advantage of the consolidated integrated cloud infrastructure they provide, although it’s seldom highlighted as an advantage in their marketing, perhaps because most library customers aren’t yet seeing what an advantage it is, and what a stumbling point this is for our patrons. Where we should be uniquely distinguishing ourselves as able to provide seamless delivery/access, we’re instead just again showing our patrons our ability to provide them with a disjointed, inefficient, frustrating, confusing experience.

In the meantime, there’s Umlaut, to help you try to stitch together a pleasant and fast delivery/access experience. It hasn’t received quite as much attention in the academic library world as I would hope. I think that’s in part because administrative decision makers have not realized the importance and benefits of improving our ‘last mile’ services, and certainly standing up an Umlaut at your institution does take some local development resources. However, in addition to my place of work, NYU and Vanderbilt have been using Umlaut for a while. Recently, I’ve heard of potential interest from several other large research university libraries. I am hoping that at some point there will be sufficient critical mass of library developers using Umlaut that we can use the platform to take the ‘last mile’ to even greater levels of convenience and integration for our users than I’ve had the resources to do with Umlaut so far.


how to make apache fake a 500 http response

For experimentation or testing (manual or automated, if automated usually captured by vcr), I sometimes need a URL guaranteed to always return an HTTP 500 error response.

Here’s some configuration you can drop in an apache conf to generate a simple default 500:

Redirect 500 /error500

Accessing http://yourserver/error500, or /error500/more/path, or /error500/more/path?with=query, all will return a 500 response with apache’s default 500 body.

The directive is named ‘Redirect’ because it is normally used to generate a redirect with a “Location” header, so you can use it for mocking any 3xx with Location too (the third argument, if present, becomes the value of the Location header). But it also works fine for mocking up other status codes like 500 or anything else, so long as you don’t care too much about what the body looks like.
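As a rough sketch of those other uses, the same directive can mock a redirect or a different error status. The paths and the target URL below are made up for illustration. In mod_alias the target URL is given only for 3xx statuses and left out for anything else.

```apache
# Always answer 500 (the example from above)
Redirect 500 /error500

# Mock a temporary redirect; the third argument becomes the Location header
# (hypothetical path and target)
Redirect 302 /fake-redirect http://example.org/somewhere-else

# Mock another error status; no URL argument for non-3xx statuses
Redirect 503 /error503
```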


More affordable cloud hosting options

A few years ago when AWS was getting a lot of attention in the library world (and everywhere else!), it immediately seemed to me, based on some back of the envelope calculations, to be likely unaffordable to libraries, and not a very good value proposition compared to our standard self-hosting. That is especially so for academic libraries, which can benefit from their host universities’ IT infrastructure, but in general, it wasn’t cheap.

Here’s a blog post arguing that in general, EC2 indeed isn’t a great value proposition if you mostly need 24/7 instances. Where EC2 shines, of course, is its ability to “elastically” (the “E” in EC2) spin up and down instances on the fly, and pay hourly, to quickly adjust your provisioning to moment-to-moment demand.

That’s for architectures that are horizontally scaled out a lot, and need to scale up a lot: not most of our library things, but perhaps increasingly more and more (if we successfully get more competent and successful!), sure. Although the author notes, and provides some calculations showing, that even this can be a dicey value proposition.

In that blog post, he mentions other providers with better prices, but not by name.

In the Hacker News thread, someone mentioned Digital Ocean. They seem to offer a service roughly analogous to EC2, but with far better prices, including hourly charging and instantaneous provisioning, so in theory supporting on-demand load balancing for highly horizontally scaled services, just like EC2. It might still be hard for a university library to beat in-house hosting, due to having our existing university IT infrastructures where we probably don’t need to pay for, or pay seriously reduced pricing for, bandwidth, electricity, server room facilities, maybe even operations staff, etc. But if you do have a context where cloud hosting makes sense, it’s worth remembering that Amazon is not the only reliable/competent player in town, and is definitely not the cheapest, although it may be the most feature-complete.


ActiveRecord: Atomic check-and-update through optimistic locking

In Umlaut, there is a database column that basically corresponds to “service_status”.  This gets set, in sequence, to “queued”, “in_progress”, and then “complete”. (Also some possibility of error statuses etc).

There is a point in Umlaut logic where it first checks to make sure a row is “queued”, then only if it is sets it to “in_progress” and executes the service.

The idea is that if it was already set to “in_progress” by someone else (another thread or another process entirely), we leave it alone, we don’t execute the logic, it’s already in progress by someone else.

And the problem with the original implementation was the race condition. First we fetch the model instance (an SQL query), and check its service_status. Then, only if its service_status is ‘queued’, do we set its service_status to `in_progress` and proceed to execute. But the race condition is clear here: in between the first SQL query to fetch, and the second to update, some other thread or process may have already updated it to `in_progress` and begun execution, and now we have double execution.

The solution? Some form of “optimistic locking” using the atomic facilities of any rdbms. Now, I’m not actually talking about the ‘optimistic locking’ feature built into ActiveRecord. You probably could use that feature here, but it requires adding a special column to your db, rescuing `ActiveRecord::StaleObjectError` etc. The optimistic locking feature built into AR is potentially a powerful general purpose tool when you need to avoid concurrent updates to any column at all in many different scenarios.

But for this particular use case, there’s a simpler way to do it ourselves. We basically want to have a single SQL line that updates the column to `in_progress` if and only if it was already `queued`, in a single atomic rdbms operation, and then lets us know if the update happened or not. We can generate such an SQL using ActiveRecord 3 update_all. 

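# Fetch one record however you normally would (placeholder condition).
# Note .first: where() alone returns a relation, not a model instance.
my_active_record_model = ModelClass.where( however_we_fetched_it ).first

# A single atomic UPDATE ... WHERE: it only matches the row if it is
# still "queued". update_all returns the number of rows updated.
num_updated =
  ModelClass.where(:id             => my_active_record_model.id,
                   :service_status => "queued").
             update_all(:service_status => "in_progress")

if num_updated > 0
  # we updated, execute
else
  # it did not have a service_status of queued, someone
  # else beat us to it
end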

  • Haven’t actually updated Umlaut yet, there are some annoying legacy issues in Umlaut that make this a bit harder to fix.
  • Note that update_all does not automatically update ActiveRecord `updated_at` columns; you can include those yourself in the hash of columns to update if you want them: `:updated_at => Time.now`.
  • This is also an example of why you need to know and understand SQL and rdbms even if you are using a good ORM like ActiveRecord. I love ORMs, but if you didn’t know SQL/rdbms, you wouldn’t be able to come up with a clear way to solve this race condition in terms of SQL, and then figure out the cleanest way to do that with AR.
  • Some answers on the web to analogous problems suggest using db transactions here. I suspect that transactions may not even be able to solve this problem, but even if there’s a way to do it somehow with transactions, it’s going to be messier. Transactions aren’t in fact the right tool here. A simple optimistic locking `update…where` is.
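
To see why the conditional update closes the race, here is a minimal pure-Ruby sketch that needs no database. `FakeStatusTable` and `simulate_claims` are hypothetical names invented for this illustration, and the Mutex only stands in for the atomicity an rdbms gives a single UPDATE statement. It is not Umlaut code.

```ruby
# Hypothetical in-memory stand-in for one database table. The Mutex plays the
# role of the atomicity an rdbms gives a single UPDATE statement.
class FakeStatusTable
  def initialize
    @rows = {}          # id => service_status
    @lock = Mutex.new
  end

  def insert(id, status)
    @lock.synchronize { @rows[id] = status }
  end

  def status(id)
    @lock.synchronize { @rows[id] }
  end

  # Roughly: UPDATE ... SET service_status = :to
  #          WHERE id = :id AND service_status = :from
  # Returns the number of rows changed (0 or 1), like update_all does.
  def update_where(id, from:, to:)
    @lock.synchronize do
      return 0 unless @rows[id] == from
      @rows[id] = to
      1
    end
  end
end

# Start several workers that all try to claim the same queued row.
def simulate_claims(worker_count)
  table = FakeStatusTable.new
  table.insert(1, "queued")
  executions = Queue.new

  workers = worker_count.times.map do |n|
    Thread.new do
      if table.update_where(1, from: "queued", to: "in_progress") > 0
        executions << n   # only the winner "executes the service"
      end
    end
  end
  workers.each(&:join)

  [executions.size, table.status(1)]
end

count, final_status = simulate_claims(20)
puts "executed #{count} time(s), final status: #{final_status}"
```

However many workers race, only one of them sees a non-zero update count, which is the property the `update_all` version relies on.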

A method to map from query to broad topic, and associated resources

Short answer: Take advantage of a facetted response on a search against a corpus that has controlled classification data.

From user query to topic, to resources on that topic

Andrew Nagy tells me in direct email that one of the new features in upcoming Summon 2.0 release is:

Topic Explorer – Over 50,000 english topics will be mapped to user queries on the fly and the API will deliver a “topic” that has an encyclopedia entry, recommended librarian, recommended subject guide, related topics, etc.

(This is described on the Summon 2.0 brochure webpage, although it wasn’t completely clear to me from the webpage that it took user queries as input to arrive at a topic).

This is along the lines of a feature that I’ve been thinking about for years: the ability to recommend appropriate subject resources (subject specialist librarians, subject guides, databases recommended on a particular subject) in response to a user-entered query in a catalog, articles, or other discovery search.

I have been thinking about it for years, as a way to get users to our librarian-recommended resources, but it’s become even more desired by some local librarians in response to our recent move towards offering an integrated article search function, currently based on the EBSCOHost API, in our local discovery UI as an alternative to directly going to individual licensed database platforms, and as a replacement for Metalib broadcast federated search.

So I think Summon is right on track here in their new feature development, which is nice to see, and less usual than it should be in the library proprietary software sector.  This is in some sense an expansion of the existing Summon feature to recommend subject-relevant licensed database platforms based on user-entered queries, expanding it to additional topic-specific resources.

While they say 50,000 topics, I assume there must be some hierarchy to their topic list, with things like institutionally specific librarians and subject pages assigned only to top-level broad topics. It would not be feasible to manually make specialist librarian assignments to 50,000 topics, of course.

So it’s really mapping to fairly broad high-level topics that matters for locally-assigned subject resources like librarians, subject pages, or subject-specific licensed database platforms.  (I’m guessing the SerSol feature may automatically map things like encyclopedia entries at the narrower, more specific elements of the 50k list, which might be neat, but is not what I’m choosing to focus on in this discussion).

The hard part about implementing a feature like this is mapping from arbitrary user query to topic (broad or otherwise).  Once you’ve done that, it’s of course an easy software problem to record URLs or other content that corresponds with each broad topic, and provide them to the user once a topic has been identified.

If you have Summon, and like its new “topic explorer” feature, great. We won’t know exactly how it’s implemented, but those with Summon licenses will be able to test it and see how effective we find it, once it’s released.

But what other options might there be for implementing such a feature yourself, at institutions that do some of their own development?

Spider the web, use text mining techniques? — NCSU

Way back in 2007, Tito Sierra then at NCSU presented at the Code4Lib conference on an NCSU project called Smart Subjects. 

As you can see in the slide show there, Smart Subjects was also an attempt to map from a user-entered search query to one or more library subjects.

It did so (and possibly still does so) in a creative way. From existing bodies of text that can be easily classified by department (Course catalogs, departmental lists of published articles), harvest all that text (classified by department), and then index in a text indexing engine, that allows information retrieval relevance ranking techniques to take arbitrary phrases (user-entered queries) and see which academic department’s harvested text corpus has the best match to the query.
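
The general idea can be sketched in a few lines of Ruby. This is only a toy illustration, not how Smart Subjects was actually built: the department texts are made-up snippets, and the scoring (query-term hits divided by text length) is a crude stand-in for the relevance ranking a real text indexing engine would do.

```ruby
# Hypothetical harvested text, keyed by department (made-up snippets).
DEPARTMENT_TEXT = {
  "Civil Engineering" => "bridge design structural analysis construction project management",
  "Business"          => "project management accounting marketing finance management",
  "History"           => "medieval europe archives colonial history"
}

def tokenize(text)
  text.downcase.scan(/[a-z]+/)
end

# Score each department by how often the query's terms occur in its text,
# normalized by the length of that text. Departments with no hits are dropped.
def rank_departments(query, corpus = DEPARTMENT_TEXT)
  terms = tokenize(query).uniq
  scores = corpus.map do |dept, text|
    words = tokenize(text)
    hits = words.count { |w| terms.include?(w) }
    [dept, hits.to_f / words.size]
  end
  scores.select { |_, score| score > 0 }.sort_by { |_, score| -score }
end

rank_departments("project management techniques").each do |dept, score|
  puts format("%-20s %.2f", dept, score)
end
```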

I believe some years after this, Tito told me he wasn’t, in the end, necessarily super enthused with the quality of results attained by this method. In any event, it is a fairly heavy-weight method, with lots of moving parts to develop and maintain and fine tune.

Tito has since left NCSU, but I believe it’s what is still powering the subject recommendations at the bottom left of their “QuickSearch” results, although it’s unclear if the corpus ever gets updated. I don’t know if they ever thought to use it to power recommendations of actual library staff too, although there is a library staff member highlighted on the QuickSearch results page.

(Looks like the NCSU Smart Subjects tool began in 2005, 8 years ago!)

Using your catalog corpus, with classification data, as a classifier?

I’ve been thinking for a while of another approach, lighter weight and taking advantage of the extensive person-hours of work that goes into our cataloging metadata. Although I haven’t had a chance to prototype it yet, I’m going to tell you about it anyway.

We have these extensive library catalogs. What if each record in the catalog had broad subjects assigned to it? (They don’t, really, but bear with me, let’s start here).

And let’s say we exposed these broad subjects in a facet. Then for a given query, you’d get a count of how many items in your result set (matching that query) were posted to each broad subject.

Say you search for “project management techniques”, and get back, in the facet based off these broad subjects:

  • Engineering (56801)
  • Business (47920)
  • Computer Science (34000)
  • Health Science (24000)

That would potentially be a pretty good list of recommended subjects corresponding to the query entered, no?  Then, if you have subject pages, database lists, specialist librarians, etc., already categorized into these same subjects, you could recommend them to the user based on her query.
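
Once you have the facet counts, the recommendation step itself is small. Here is a minimal sketch, assuming a hypothetical `SUBJECT_RESOURCES` table of local assignments (the librarian labels and guide paths are invented) and facet counts passed in as a plain hash:

```ruby
# Hypothetical local assignments of resources to broad subjects.
SUBJECT_RESOURCES = {
  "Engineering"      => { librarian: "engineering librarian", guide: "/guides/engineering" },
  "Business"         => { librarian: "business librarian",    guide: "/guides/business" },
  "Computer Science" => { librarian: "CS librarian",          guide: "/guides/cs" }
}

# facet_counts: { "Subject name" => number of matching records }
# Returns resources for the highest-count subjects we have assignments for.
def recommend(facet_counts, limit: 2, resources: SUBJECT_RESOURCES)
  facet_counts
    .sort_by { |_, count| -count }
    .map { |subject, _| subject }
    .select { |subject| resources.key?(subject) }
    .first(limit)
    .map { |subject| resources[subject].merge(subject: subject) }
end

counts = { "Business" => 47920, "Engineering" => 56801,
           "Computer Science" => 34000, "Health Science" => 24000 }
recommend(counts).each { |r| puts "#{r[:subject]}: #{r[:guide]}" }
```

How many subjects to recommend, and whether to require some minimum share of the result set, are tuning choices this sketch leaves open.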

Now, our library catalogs do have classification data in them assigned to individual records, using vocabularies created and assigned through the hard work of many catalogers over many years. Is there a way to use this data for this purpose?

The common classification systems of Dewey and LCC both classify rather too finely for this use — we need to map to a vocabulary of dozens of topics/subjects/disciplines, so we can assign local resources to each one. Hundreds or thousands is too many.

But is there hierarchy in DDC or LCC that would let you “post up” from finer-grained specific classifications, to more broad classifications useful for our purpose here? DDC might have, but I don’t have many DDC records in my local corpus, and haven’t spent much time with DDC. LCC is known to be less hierarchical than DDC, but there are still ways to get some broad classifications out of it, using the top-level schedules. But it’s tricky to make this work, and the broad categories you end up with aren’t necessarily as useful as we’d like. (See the “Discipline” facet in our own Solr-based catalog, which is constructed from LCC “posted up” into broad classification. For “project management techniques”, the top Discipline facets are “Technology”, “Science”, and “Social Science”, which aren’t necessarily wrong, but also aren’t as useful as we might like; they are too broad and somewhat archaic.)
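
The simplest form of “posting up” from LCC looks only at the first letter of the call number. The sketch below is an illustration rather than the code behind our Discipline facet: the table holds just a handful of the top-level LCC schedules, and the method names are made up.

```ruby
# A subset of the top-level LCC schedules, for illustration only.
LCC_TOP_LEVEL = {
  "H" => "Social Sciences",
  "Q" => "Science",
  "R" => "Medicine",
  "T" => "Technology",
  "Z" => "Bibliography and Library Science"
}

# "Post up" a call number such as "HD69.P75 M36" to its top-level class,
# using only the first letter. Returns nil for classes not in the table.
def broad_class(call_number)
  letter = call_number.to_s.strip[0]
  letter && LCC_TOP_LEVEL[letter.upcase]
end

# Count records per broad class, the way a facet would.
def broad_class_counts(call_numbers)
  call_numbers
    .map { |cn| broad_class(cn) }
    .compact
    .each_with_object(Hash.new(0)) { |name, counts| counts[name] += 1 }
end

puts broad_class_counts(["HD69.P75 M36", "T56.8 .K47", "TA190", "QA76.9"]).inspect
```

A one-letter lookup like this is exactly why the resulting categories come out so broad; anything finer needs a mapping from call number ranges to subjects.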

The University of Michigan High Level Browse classification

The University of Michigan has developed their own High-Level Browse (HLB) classification. One of the main uses for this classification is indeed a broad classification facet in their catalog search.

The U of M HLB is conveniently based on LCC, and U of M maintains mappings from LCC call numbers to their own HLB classes. Which is what makes the facetting work in the first place, for any corpus with LCC classifications on items.

They’ve developed their HLB based on their own schools, departments, and programs at U of M.  You could try to develop the same locally. But it’d be a lot of work. And U of M awesomely shares their classification, with LCC mappings, in XML form too. So you could just write software to download theirs, use it in indexing into your own Solr catalog index, and get U of M HLB facets in your catalog too — and use them to power a subject recommender too.

Any large research university probably has academic classification needs roughly similar to U of M’s, although there will certainly be special programs you wish were represented that aren’t (or that are in U of M’s, unnecessarily for you), but it will likely be good enough, if you don’t have the resources/organization to develop and maintain your own local classification. (I’m amazed U of M even pulls it off, honestly.)

Let’s give it a try, go to U of M’s catalog and do a search, and check out the top categories represented in the “Academic Discipline” facet in the sidebar.  For “project management techniques”, it’s Business, Management, Business (General), Social Sciences, and Engineering.

If a system made recommendations for subject guides, specialist librarians, subject-relevant databases, and other subject resources, based on those classifications… they’d be fairly relevant to the query, right?

Do some of your own queries, how well does it work?

(In the U of M HLB, there are still a few levels of hierarchy, all of which may be represented in the facetted result. For instance, “Business (General)” is a sub-category of the more general “Business”. Inter-mixing them both is probably appropriate for facet response, but for making subject recommendations some experimentation is called for as to when to use more-specific and when to use more-general, and when to de-duplicate when a super- and sub-class are both represented in the potential ‘best subjects’.)
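
One possible de-duplication policy, sketched here as an assumption rather than a recommendation, is to keep the more specific class and drop any of its ancestors that also appear. The `PARENT_OF` table is a hypothetical fragment containing only the one parent/child pair mentioned above.

```ruby
# Hypothetical fragment of a hierarchy: child => parent.
PARENT_OF = { "Business (General)" => "Business" }

# When both a class and one of its ancestors appear in the list,
# keep the more specific class and drop the ancestor.
def drop_ancestors(subjects, parent_of = PARENT_OF)
  ancestors = subjects.flat_map do |s|
    chain = []
    while (s = parent_of[s])
      chain << s
    end
    chain
  end
  subjects.reject { |s| ancestors.include?(s) }
end

top = ["Business", "Management", "Business (General)", "Social Sciences", "Engineering"]
puts drop_ancestors(top).inspect
```

The opposite policy (keep the broader class, drop its descendants) is just as easy to write; which one produces better recommendations is the part that needs experimentation.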

Not just for catalog searches

The idea is to use the catalog as a classifier, but that doesn’t mean you can only use it for catalog searches.

For any search in any system you control enough to add custom features to, you could add a feature based on the catalog as a classifier. Even if they are searching in a non-catalog article discovery system, the software could still, behind the scenes, take the user’s query, execute its own under-the-hood query against the catalog, look at the facetted broad subject results, and use them to make subject recommendations.
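
For a Solr-based catalog, the under-the-hood query might look something like the sketch below. The base URL and the `broad_subject_facet` field name are hypothetical; the request asks for zero result rows and only the facet counts, and the parser assumes Solr’s default JSON layout, where facet values come back as a flat array of alternating names and counts.

```ruby
require "json"
require "uri"

# Build a Solr query URL that asks for facet counts only (rows=0).
# The base URL and the facet field name are hypothetical.
def facet_query_url(user_query,
                    base: "http://localhost:8983/solr/catalog/select",
                    field: "broad_subject_facet")
  params = {
    "q" => user_query, "rows" => "0", "wt" => "json",
    "facet" => "true", "facet.field" => field, "facet.limit" => "5"
  }
  "#{base}?#{URI.encode_www_form(params)}"
end

# Solr's default JSON format returns facet values as a flat array:
# ["Business", 47920, "Engineering", 56801, ...]
def parse_facets(json_body, field: "broad_subject_facet")
  flat = JSON.parse(json_body).dig("facet_counts", "facet_fields", field) || []
  flat.each_slice(2).to_h
end

sample = '{"facet_counts":{"facet_fields":{"broad_subject_facet":["Engineering",56801,"Business",47920]}}}'
puts facet_query_url("project management techniques")
puts parse_facets(sample).inspect
```

Fetching the URL (for example with Net::HTTP) is left out so the sketch runs without a Solr server.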

Not necessarily just with your own catalog

Likewise, there’s no reason you need to use your own local catalog as the classifier. Any catalog will do — if it can provide a facetted response of broad subject classification, has an API such that you can use it in this way, and the operators don’t mind you using their catalog in your service.

WorldCat would be great, if OCLC added a broad subject classification facet, and an API to retrieve such. Umich’s catalog, already using their own in-house HLB classification, might be convenient too.

Of course, if you do add umich’s HLB broad subjects as a facet in your own local catalog, your users get the advantage of using that facet directly for their catalog searches too. (Assuming you have enough control of your local catalog to add such a thing, for instance because your catalog is based on Blacklight, VuFind, or another tool using a local Solr you control.)

Idea worth exploring?

I’m not sure when/if I’ll have time to investigate this idea, although I probably will eventually. But I absolutely don’t mind if someone else runs with it and beats me to it — as long as you share back your findings, how well it worked, etc.

Posted in General | 7 Comments

One scenario for the death of the academic library

My last post has attracted some interesting discussion. Eric Hellman, in the comment thread, recommended this very interesting recent article, Open Access, library and publisher competition, and the evolution of general commerce, by Andrew Odlyzko, 2013. 

I recommend the entire article heartily, but provide some extensive pertinent excerpts here, with some commentary.

The ARL has statistics showing library budgets as fractions of total university budgets for a sizable collection of their members [10]. The chart for the 40 members that have reported since 1982 shows an inexorable decline in this ratio, from about 3.7% to a bit under 2.0%…

…The share of library budgets that goes out in purchases of books, journals, and databases has grown substantially, from 33% in 1990 to 42.5% in 2010…  Further, all of this growth is accounted for by serials. Books and other materials have just about held their own (with books shrinking at the expense of the rest).

In my last post, I asked what percentage of faculty or other university community members would have to think the library is not worth what’s being spent on it before library budgets started decreasing as a percentage of host institution budget.

It turns out, library budgets have already been declining as a portion of total university budgets, and within the library’s budget, collections have been rising as a portion of library budget. Meaning spending on library staff, professional and otherwise, has decreased even further as a proportion of university spending than library budgets in total.

Perhaps, in fact, that point I asked about has already been reached.
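
As a back of the envelope check on that “even further” claim, here is the arithmetic using only the figures quoted above. Two loud caveats: the two series start in different years (1982 for the library share, 1990 for the collections share), and “non-collections” spending covers more than staff, so treat this as a rough illustration rather than a precise calculation.

```ruby
# Figures quoted from Odlyzko above. The start years differ between the
# two series, so the combination is only approximate.
library_share_then, library_share_now         = 0.037, 0.020
collections_share_then, collections_share_now = 0.33, 0.425

# Share of the university budget left for everything in the library
# other than collections purchases (staff, operations, and so on).
def non_collections_share(library_share, collections_share)
  library_share * (1 - collections_share)
end

then_share = non_collections_share(library_share_then, collections_share_then)
now_share  = non_collections_share(library_share_now, collections_share_now)

puts format("non-collections spending: %.2f%% -> %.2f%% of university budget",
            then_share * 100, now_share * 100)
puts format("library total fell by %.0f%%, non-collections by %.0f%%",
            (1 - library_share_now / library_share_then) * 100,
            (1 - now_share / then_share) * 100)
```

On these numbers the non-collections portion goes from roughly 2.5% of the university budget to roughly 1.15%, a steeper fall than the library budget as a whole.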

There are many interesting statistics at [9] demonstrating decline of the traditional functions of libraries. Thus between 1995 and 2010, the number of students at ARL institutions grew by 33% (with the ranks of teaching faculty and graduate students climbing 15% and 43%, respectively). The only category of library services involving physical material that showed growth was interlibrary loans, which climbed 92%. This reflects libraries concentrating their budges on serials, and giving up on trying to keep up with the growth in the number of new books being published. In other categories, initial circulation (i.e., excluding renewals) of physical volumes dropped by 42%. Thus it is a gross exaggeration that “nobody uses the library anymore,” as one sometimes heard from faculty or students. But the decline in borrowings per student by more than half is telling. What is perhaps most surprising is that the number of requests for reference assistance dropped by 66% in absolute terms, as is shown in Fig. 6, and thus by about 75% on a per-student basis. This is certainly a core competency of librarians, and they are great at navigating the torrents of electronic information, as well as providing guidance to the use of traditional printed sources. However, it appears that Google, Wikipedia, publisher databases, and the like are “good enough” for most scholars, and that the convenience of around the clock access from anyplace outweighs the higher quality that librarians provide…

…The basic and very promising approach open to publishers is to continue marginalizing libraries by extending the reach and scope of “Big Deals.” The consortium model, in which groups of libraries cooperate to get access to a “Big Deal” is already common, and can be pushed further. The ultimate situation might be national “Big Deals,” where some toplevel bodies pay for access for everyone from a nation. Enlarging the “Big Deal,” especially through further mergers, but also by including additional information sources, can serve to create packages that simply could not be dispensed with. The most obvious move in that direction (which is already taking place to a small extent) is to make books, both current and old ones, a part of the “Big Deal.” (Recall that the process of digitizing old printed materials is extremely inexpensive.)…

I think an extremely likely scenario for the death of the academic library will be our hosting institutions simply paying vendors, whether publishers, aggregators, or other newer ‘disruptive’ businesses, directly for services (the content itself, the platforms that host and organize it, and the ‘discovery’ services to search it), needing only a local skeleton staff to handle licensing. If the bulk of a library’s budget goes simply to passing money on to vendors for large ‘big deal’ bulk packages, then only minimal professional staff, and not much of a library organization at all, is required to do the bookkeeping and ordering.

Perhaps in the future, it will be clear that by 2013 this was already more or less a foregone conclusion to the story of the academic library. Odlyzko’s article shows some of the indicators pointing in that direction.

What about libraries? They are handicapped in the competition with publishers by several factors, see [62]. One of them, that they have the bulk of the resources, and are thus a fat target, is a strength as well. At least in principle it makes possible revolutionary changes. In particular, as was shown earlier, just the external journal purchases of the ARL libraries alone could provide Open Access publishing for the world’s entire scholarly literature. Had libraries thrown their resources enthusiastically behind new, low-cost Open Access journals, perhaps the current scene and the unfolding future sketched here would have been different. But that would have required many research partners willing to put their energy into the enterprise (certainly a very doubtful proposition, given the inertia in the academic system), and the willingness of librarians to cannibalize their bread-and-butter operations. Certainly librarians present a classic case of Christensen’s “innovator’s dilemma,” pressed to maintain traditional services, and therefore slow to embrace new ones. As an example, digital libraries have been discussed in the library literature for decades. Further, the amount that ARL libraries spend in a single year on acquisition of serials would have sufficed, with plenty left over, to digitize all their standard books and journals that are out of copyright. Yet it was outside efforts, in particular the Gutenberg Project (the early pioneer, almost forgotten), Google Books, and the Internet Archive, that led the way…

…We also see libraries moving into other services, such as providing long-term storage for publications, data sets, and so on. However, there they are competing not just with publishers, who also see the opportunities, but also other organizations, such as campus information technology units, high performance computer centers, and a variety of new commercial startups. The opportunities are many, but so are the competitors….

Of course, people have been talking for over a decade about how the internet and other changes in the information environment will or may spell the death of libraries. Some may see this as tiresome alarmism.

But I think now we’re actually seeing it happening. Our response to this library-apocalypse thinking has often been: “Sure, traditional library services exactly as delivered may no longer be as important, but there is obviously an even greater need than ever for impartial information services and expertise for academic and civic communities, and libraries can and will provide these services.” But at this point, I think by and large we’ve seen libraries fail to rise to this challenge, which is why we’re seeing the indicators of the beginning of an actual, not just hypothetical future, sidelining of libraries in the university environment.

If the library effectively ceases to exist as an organization providing information expertise to the university community, one thing our host institutions lose is an organization which can facilitate research/information needs from a perspective of interests aligned to those of the host university community.

The library is one of the few information organizations involved in research life that does not have business interests based on selling our users something (or selling our users’ privacy to someone else), or on convincing users to buy a particular product. Our only interest is in facilitating our users’ own self-directed goals and needs. Libraries can thus, uniquely in the information environment, provide services with transparency, impartiality, assertive protection of user privacy, and a professional ethical responsibility to act always in the interests of our patrons, never sacrificing them to our own business interests.

The existence of libraries as such disinterested advisors is thus extremely valuable in making possible the impartial non-market-based free inquiry at the idealistic heart of academic research and learning itself.  I firmly believe it will be a loss to the academy and to society to see libraries fade to irrelevance.

But that’s not going to be enough to save the library, in the current environment, if we can’t also provide cost-effective services that not only satisfy (and we are barely doing that) but go on to delight and excite our host communities by what we can do to make their work easier, more productive, and more pleasurable. A library’s impartiality in failing to deliver services of value is naturally of limited perceived value to the host organization.

It will require some disruptive changes to our business as usual to get there: close attention to our patrons’ changing needs, habits, environments, and preferences, and some creativity and risk-taking in attempting to position ourselves to engage our patrons. There’s no guarantee of success and we will inevitably make missteps along the way, but how many library organizations are even seriously engaging in the attempt, with all the disruptive risk and challenge it entails?


Academic library existence at risk?

From the Ithaka survey of US Faculty, on library perceptions: Figure 44 in the full report.

“Percent of respondents agreeing strongly with each statement”

Because scholarly material is available electronically, colleges and universities should redirect the money spent on library buildings and staff to other needs

  • 2012: ~18% (results annoyingly seem to only be given in bar chart form, requiring me to graphically estimate numbers, sorry)
  • 2009: ~10%
  • 2006: ~8%

Because faculty have easy access to academic content online, the role librarians play at this institution is becoming much less important

  • 2012: ~20%
  • 2009: ~17%
  • 2006: ~4%

Around 1/5th of faculty surveyed agree strongly with those statements in 2012. According to the report narrative, the numbers are even higher in the sciences and somewhat lower in the humanities.

What do you think those numbers will look like in 2015 when they run the survey again?

At what number (if not already) will the percentage of ‘strongly agreeing’ faculty (especially to the first one, ‘redirect money spent on library…’) result in lowered funding to libraries?

Because there is certainly some point, at any institution, at which it will, right?

Different institutions have different decision-makers for library funding, depending on public vs. private university, centralized vs. decentralized, etc. But in almost all of them, faculty opinion is going to have an effect on library decision-makers, and when a substantial number of faculty think the library should have its funding reduced…?

While I think libraries ought to continue to have a huge role in university teaching, research, and culture, I think it’s indisputable that our role is, in fact, lessening. And I think, at most institutions, faculty are right that the value they are getting for the substantial investment the university makes in the library… is getting smaller and smaller. Less and less justified.

It’s not an issue of marketing, or just properly ‘branding’ ourselves.  (Or do you think that our marketing has gotten much poorer in the past 6 years, and that’s why the number of faculty thinking our budget should be reduced has doubled? Really?)

Our decades-old service models will not justify our budgets to our host institutions. The services we used to provide are, in fact, no longer as needed or valuable as they once were, nor as successful: even in cases where what we’re trying to do is still needed and wanted, we’re failing at fulfilling those needs.

We will not survive by focusing on what we think our patrons need and ought to want, in contradiction to what our patrons say and believe they need and want. We will not survive by trying to convince them to want what we provide, but only by changing and coming up with new provisions that excite and delight them.

We need to change. We need to provide new and different services. We need to preserve some services, but significantly change the manner in which they are delivered.

And yes, that means we need to reduce and eliminate other services too. Change is hard. Yes, there are still some staff and patrons who are used to and rely on the services we’ve got now exactly how we deliver them now, and are going to be disrupted and upset by change.

But the number of patrons who think we are decreasingly relevant — and deserve a smaller share of the university’s budget — gets larger all the time.

When those numbers start affecting the relevant decision-makers, who start cutting library budgets as an overall share of the university budget (not just because overall university budgets are shrinking, which they are too), our services will be reduced and eliminated then anyway. Our staff and organizations will be cut, and in some cases even eliminated.

Insisting that what we’re doing really is valuable, and our patrons are wrong not to realize it — isn’t going to work (even if it were true, which I do not believe it is). We have to learn how to change faster and better, or we are not going to exist anymore.

The need for expert assistance in organizing and finding information is not going away; it’s only getting larger. There is, or ought to be, an important place for libraries in the contemporary university. But only if we learn how to provide the information services that our host institutions need today, not what they needed 20 years ago, and are willing to seriously change up our game. Many of our library organizations are not willing to do this: in their practice, if not in their leaders’ words, they do not exhibit a willingness or capability to change. Those are going to be the organizations that disappear in the coming… decade? There will come a point, if it has not already come, at which it is too late to recover our value to our host institutions.
