using free APIs, exit strategies

So these days there are lots of really well-done, useful, free APIs around, especially from Google. The obvious invitation is to use them in your apps.

The potential problem is that a free API, which you have no contract or service agreement for, can disappear at any time without notice. I mean, really any third party service can disappear at any time (company goes out of business, is hit by a tsunami, whatever), but completely free services are even more at risk. The company could decide to stop providing the service; or they could decide to start charging for it; or they could change the ToS such that you no longer qualify (or maybe you were always violating the ToS but you or they just noticed). They could give you plenty of advance notice for any of this, or they could give you none at all, and you could just find it stops working one day.

That’s not a reason not to use a really useful free API, but it’s probably a reason to think about, when you start using it, what you would do if it went away; if you have a way to compensate, how long it would take you to do so, or if you can get away with simply abandoning the service (hypothetically formerly) provided by that free API altogether. (And part of this is considering who is going to be around to deal with this when/if it happens; if you’re not still in your job, have you left enough documentation or training for your replacements, or your clients, to understand what’s going on and what to do?)

Anyway, that’s what just happened with a bunch of useful Google APIs, which prompts us to think about the issue.

None of these Google APIs will be going away sooner than 6 months from now (some will last as long as 2 years from now). Some have replacements (although not necessarily feature-identical replacements), some don’t. It could have been worse, but it’s still going to be bad for some people.

The deprecated API that’s getting the most attention is the Translate API; it is going away in 6 months, with no replacement. Apparently some people had built a business on software whose core features relied on Google Translate. If they hadn’t previously considered what they’d do if the Google Translate API went away (possibly including considering and deciding that they’d just abandon that software and move on to something else in that case, and that they were okay with that)… well, hopefully a lesson has been learned, not just by them but proactively by others who now know they had better consider such things. The elimination of certain free Google APIs some people have been depending on gives us an opportunity to reflect a bit about how we plan our use of such services in general.

Google Books, and HathiTrust

I don’t use Google Translate API in any software (and now clearly never will), but I do use Google Books API, which has also been deprecated.

Fortunately, a replacement is available, and based on some initial analysis I did previously, it looks like it will still support my use cases.

Although with some inconveniences, and potentially some bigger problems. It’s inconvenient to have to re-write my code for the new API; but that’s just part of software development, although certainly especially so when using a free API with no contract or agreement. (I’ve begun referencing Ranganathan’s sixth law analogized for software: Software is a growing organism. Always. If you want your software to remain useful for a long time, you’re going to have to keep putting development into it, almost always.)  And the requirement for an API key is inconvenient, and the new rate limits may be prohibitive for me, although it looks like maybe you can get a free increase to your rate limits, will have to look into it.

So it’s going to take some elbow grease, but I plan to keep using the new Google Books API for my use cases. My use case is basically, taking a known item citation, and identifying search-inside-the-book or full text reading/downloading access to it.

But it’s enormously comforting to me to realize that even if Google Books were to take away its API from me entirely, I’ve still got HathiTrust. Now, HathiTrust certainly isn’t a completely identical replacement: while HT has many of the volumes in Google Books at the same access levels, there are probably some in GBS that are not in HT. There may be some with increased access in GBS compared to HT. (But there are definitely some where the reverse is true.) Google Books gives you search-inside-the-book with keywords-in-context in your hits even for in-copyright books they have no permission from the publisher to do anything special with; HT more cautiously does not. HT does not allow full PDF downloads of even public domain books if they were scanned by Google, because Google won’t let them (that is, for the general public; if you are a HathiTrust member, you get these downloads). Google offers epub downloads too; HT does not (yet, anyway).

But if the Google Books API were to become unusable by me tomorrow, I’d still be able to provide the fundamental services (links to full text and search inside for many volumes) by way of HathiTrust. In fact, my software is already consulting both HT and GBS for this, so I wouldn’t even have to make any code changes. Even if it were to disappear without notice, I’m still good.
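To make the "consult both, survive losing one" pattern concrete, here is a minimal Python sketch. It is not the actual code behind my software, and every name in it (`AccessLink`, `find_access`, the two stub providers, the example.org URL) is made up for illustration; real providers would wrap calls to the Google Books and HathiTrust APIs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class AccessLink:
    source: str       # which service supplied the link, e.g. "GBS" or "HT"
    url: str
    full_text: bool   # True for full view, False for search-inside only


# A provider takes an identifier (e.g. an ISBN or OCLC number) and returns links.
# This signature is an assumption of the sketch, not any real API's shape.
Provider = Callable[[str], List[AccessLink]]


def find_access(identifier: str, providers: Iterable[Provider]) -> List[AccessLink]:
    """Ask every provider in turn; a provider that fails is skipped, not fatal."""
    links: List[AccessLink] = []
    for provider in providers:
        try:
            links.extend(provider(identifier))
        except Exception:
            # Deliberately broad: the service may be down, withdrawn, or
            # rejecting us over ToS or rate limits. Carry on with the rest.
            continue
    return links


# Hypothetical stand-ins for real API clients.
def google_books_stub(identifier: str) -> List[AccessLink]:
    raise ConnectionError("API withdrawn without notice")


def hathitrust_stub(identifier: str) -> List[AccessLink]:
    return [AccessLink("HT", "https://example.org/ht/" + identifier, full_text=True)]


links = find_access("oclc:12345", [google_books_stub, hathitrust_stub])
for link in links:
    print(link.source, link.url)
```

The point of the design is that losing one provider degrades the results rather than breaking the feature, and if every provider fails you get an empty list, which the calling page can treat as "no links to show".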

This is really comforting, and I think it shows the extreme importance of the HathiTrust effort, and the extreme foresight of umich for initiating it. These services are too important to the future of our libraries to rely solely on a free API with a third party with no contract. With HathiTrust, a cooperative owned and controlled by libraries has a say in it too. I bet if GBS did go away, HathiTrust would have more motivation (perhaps pushed by its members) to make its services even better to compensate (within the limitations of its contracts with Google).

Of course, for many of us, HathiTrust too is just another free API we have no service level agreement or contract with! But better to have two of em than rely on just one. And if your institution joins HathiTrust as a member, now it’s a different, more reliable relationship.

(At least in theory; many library cooperative software/technology projects seem to end up really awful design-by-committee monstrosities, with very little reliability at all.  HathiTrust seems to have avoided this so far, it’s really well-designed and implemented as a product; it’s probably not a coincidence that this is because it began as an initiative developed by one institution (with good software engineers), not as a design-by-committee distributed collaboration. Hopefully they can retain their quality even now that they are a more distributed membership organization).

Not Just APIs; Google Scholar

Over the past few years, I’ve seen it suggested a few times that, well, Google Scholar is so great for finding scholarly articles, and our attempts to provide such services (that cross vendor boundaries, as Scholar does) tend to be so weak, why should we spend resources on trying to provide such a service at all? (With broadcast search like Metalib, or with aggregated indexes like Summon, etc.)

Google Scholar isn’t perfect, and there are some important things that it does less well than our local solutions, but without getting into details, I’ll agree that overall Google Scholar is a much much better solution than Metalib or even Summon; despite its flaws, it works better and more easily for our users. Let’s just agree to that for now, for the sake of argument. (The only thing I will point out specifically is that Scholar’s lack of an API means we can’t do some of the coolest stuff we could do with a product with an API, but anyway).

Okay, so let’s say we stop spending resources (license fees, staff time, etc) on Metalib or Summon or any local tool for cross-vendor search of scholarly articles, and just direct our users to Google Scholar.

What happens when/if Google Scholar goes away?  It is indeed a free service, which we lack any contract or service agreement for.  Google as a company probably isn’t going away any time soon, but how likely is it they might decide to eliminate Scholar, if it’s not making them money? It’s really hard to say. It seems not that likely at this point, but it’s certainly possible.

And if it were the only way we had of providing our users with a way to search scholarly articles cross-vendor, and it did go away, it would be disastrous for us academic libraries. Helping users find scholarly articles is a core part of our mission/business. (In fact, I think many academic libraries misallocate their resources, putting more into library ‘catalog’ discovery than journal article discovery, when it probably should be the reverse as far as our users’ needs are concerned).

I think this is way too core a function to our users’ research (you know, what our job is to support) to put ourselves in a position where it could go away with no notice, and we’d have to start from scratch developing or purchasing an alternate solution. Going back to searching individual platform websites individually isn’t good enough, although some users even now will choose to do that. And certainly some users now will choose to use Google Scholar and other free resources, and our job is integrating those external free resources as well as we can with our infrastructure (a difficult challenge, surely), rather than discouraging users from using them. But our job also has to be maintaining a solution we do have control over and reasonable expectations of longevity for. Yeah, we’ve got to do all of it, which takes resources, but that’s our job; this is a core research function.

This same analysis would apply to the similar argument: “Why do we need to provide a ‘library catalog’ at all, can’t people just use Google?”  I’m often unsure if people saying this mean Google Books specifically, or Google Web Search, or what. I’m not sure they know themselves, I get the feeling they aren’t thinking through the details, just vaguely hand-waving “Google!”.  There are a lot of tricky details in that plan, replacing ‘the catalog’ with ‘google’ isn’t exactly the same situation as replacing Metalib/Summon/etc with Google Scholar.  But one of the details remains — a service you have no contract or agreement or expectation of stability for, what happens if it goes away or stops working, what happens to your ability to meet your mission and satisfy your users, if you don’t have an alternative that was already running or can be immediately put into place?


6 thoughts on “using free APIs, exit strategies”

  1. Concerning the Google Translate API, they do offer an alternative, the Google Translate Element, which is what I used in my catalog, and apparently that will remain.

    I do agree that librarians must consider the alternatives that tools not in their control will very possibly shut down. Still, that is just one of the challenges of living in the new information world–if we don’t use the free APIs that everyone else is using because it may shut down, we will be seen as backward Luddites, as indeed, we would be. It would be nice to be in complete control, like when we had card catalogs, and everybody either used our tools the way we decided, or they could all just do without. Even then, a vital journal index could close down or something and it was a relative disaster. Those days are gone and we have to learn how to live without many of the controls, focusing instead on disaster minimization.

    I guess the way I look at it is that we will be more or less forced to use the free APIs and other free sites if we want to be useful to our patrons (i.e. our communities), and if those tools shut down, while it will be a headache for us, it will be a disaster for the community. Google Scholar is very popular and that on its own makes it highly useful to Google; what would worry me more is if it became less popular. Still, it could be shut down. Google Books could potentially become even more important. It is obvious to me that Google’s rethinking of different tools (the APIs, shutting down Google Video and other tools) is evidence that Google may be suffering from the economic downturn as well.

    As I mentioned, private companies just shutting things down will be rough on us but disastrous for our communities. That is where the solutions will be found, I think: getting the communities to ensure that essential tools will not be shut down at the whim of a manager. Libraries have little or no power in this regard, and while we can serve perhaps as points of reference, and provide information, the real power will have to come from the communities: professional associations, municipalities and governments.

    But no matter what, libraries will be very uncomfortably in the middle and will continue to lose control.

  2. There were many ‘communities’ that found Google Translate useful, that did not stop Google from withdrawing it.

    But yes, if things we rely on are shut down and we don’t have backup plans, it will be bad for us _because_ it will be bad for our communities, our job is supporting them.

  3. A minor point not central to your theme (which I do agree with) – I would compare GS to Scopus or Web of Science. Big index, no human indexers, no fancy metadata, across publishers. Not to MetaLib, which searches across databases, some of which have value-added indexing.

  4. Christina: What I think GS, Scopus, Web of Science, and Metalib all have in common is that they search across publisher/full-text-aggregator boundaries, and the reason we use them is to give users a single consistent interface for searching citations, ideally as large a universe of citations as possible. Scopus and Web of Science also add additional value-added services on top (notably, citation counting/chaining). I guess Google Scholar does too, to some extent. But I think these all can be put in the same category with regard to what searching value they provide to users: searching across a gigantic corpus of citations, across vendor boundaries

  5. huh, there was more to my comment above which didn’t seem to make it.

    But one part was saying thanks for bringing ISI and Scopus into the comparison, now it occurs to me to think — could we direct our users to ISI or Scopus instead of Metalib or Summon? I wonder how ISI or Scopus compare to Google Scholar for breadth of holdings, quality of relevance ranking/search results, and ease of use to users.

    I seem to recall seeing a paper or two comparing Google Scholar to Metalib with user studies, googling for “metalib vs. google scholar” finds a couple.
