APIs and vendor lock-in

Eric Lease Morgan asks on code4lib:

I heard someplace recently that APIs are the newest form of vendor
lock-in. What’s your take?

My reply (expanded a bit from my listserv post):

Standards-Based

When they are custom vendor-specific APIs and not standards-based APIs, they can definitely function that way. I'm still not sure whether even a vendor-specific API is more or less lock-in than NOT having an API. On the one hand, you will start to have software written against the vendor-specific API that won't work without modification if you switch vendors. But on the other hand, take SFX and Umlaut: Umlaut does so much more than SFX, and the SFX adapter piece is such a small part of it, that in that case, for us at least, having SFX with an API and Umlaut on top of it definitely makes it _easier_ for us to switch link resolvers without disrupting the services built on top.
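
As a concrete illustration, here is a minimal sketch of that kind of adapter layer, in Python with hypothetical names (Umlaut's actual code is Ruby and differs in detail):

    from abc import ABC, abstractmethod

    class LinkResolverAdapter(ABC):
        """Hypothetical interface: the only thing the rest of the
        application knows about the link resolver."""

        @abstractmethod
        def fulltext_links(self, openurl_context):
            """Return full-text link data for a citation."""

    class SFXAdapter(LinkResolverAdapter):
        """The small vendor-specific piece: the only code that talks
        to the SFX API directly."""

        def __init__(self, base_url):
            self.base_url = base_url

        def fulltext_links(self, openurl_context):
            # Query SFX and translate the response into the
            # application's own objects. Switching vendors means
            # writing one new adapter, not rewriting the services
            # built on top.
            ...

The vendor-specific surface area stays small, so the lock-in stays correspondingly small.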

Which we don’t do well at

But really, what you want is standards-based APIs, not vendor-specific APIs. That would give you the best of all worlds. There are a couple of challenges that keep us from getting there, though. One is that the library community, historically, is, well, pretty AWFUL at writing standards. We come up with standards that don't actually accomplish what they were intended to, are too complicated for anyone to implement right (on either the producer or consumer side), and leave so much wiggle room that someone can claim to support the standard but not in a way that any other software will ever understand. (NCIP, anyone?)

Outside standards?

So there are a couple of ways to try to get better at this. One is definitely looking outside the library world for standards to use. But unlike some code4libbers, I don't think (in my experience) that's always possible or easy. We have problems that, while not entirely foreign to the larger world, aren't as high a priority for most of the non-library world, meaning robust standards solutions for them don't yet exist out there. However, especially when standards are extensible (as XML-based ones often are), you can sometimes start with a general standard and extend it for the library space.

Standards based on, not preceding, practice

Secondly, instead of creating standards before anyone has actually tried solving the problem the standard is meant to solve (as we often seem to do), the BEST standards are created by generalizing/abstracting from existing best practices. A bunch of people try it first, you see what works and what doesn't, you see what the actual use cases and needs are, you take the best out of what's been done, and you standardize it. But doing it this way means you need to go through a period of vendor/product-specific APIs before you can get to the standard. The library world is still immature in developing good software infrastructure; we're going to need to go through some more pain for a while, no way around it.

Vendor capabilities?

But another problem in all of this is that vendors may not have the interest OR the in-house expertise to actually provide standards-based APIs. The APIs we often get now from vendors, frankly, are kind of kludgey, and do not fill me with confidence that the vendor actually has the staff or resources allocated to create good standards-based APIs, which definitely takes more time than creating a kludgey vendor-specific one-off. Or maybe the vendor actually is uninterested in this because they want lock-in. Or maybe it's just that the quality of your APIs doesn't affect your sales at all, so it doesn't make (short-term, at least) business sense to do it well. (Heck, the _presence_ of an API has only just begun to affect sales, and libraries aren't yet good at judging how good an API is, so even a crappy one is probably 'good enough' for sales purposes.)

Open source, community work

One way out of this is definitely open source. We'll work out the best practices and standards ourselves, and then we start insisting that vendors follow them. The DLF ILS-DI API is perhaps one example of an attempt at this, created from a generalization of the experience of library developers. But the library developer community is also small, and generally fairly inexperienced. Designing APIs is best done by experienced developers who understand what's going to make an API usable or not.

But, anyway, one step at a time. I firmly believe that even vendor-specific kludgey APIs are better than no APIs at all — we learn how to do better by trying.

Consuming applications

It's also worth pointing out, as some subsequent commenters on that thread did, that the application consuming an API bears some responsibility here. As much as possible, you need to abstract out the API connector code, so you can easily switch the app between multiple APIs, so long as they all offer more or less the same data/capabilities (which certainly isn't guaranteed, admittedly). This too takes more time, but is doable. Among the software I work on, Umlaut manages to do it pretty well; Xerxes does not. This is in part because the more focused and limited function of a link resolver, compared to a federated search engine, made it easier to do in Umlaut. And I guess about half of the SFX API is more or less standards-based: OpenURL.
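
Here's a rough sketch, with hypothetical names (not actual Umlaut or Xerxes code), of what that abstraction can look like on the consuming side:

    # The app depends only on a common connector contract; which
    # vendor is used becomes a configuration decision.

    class SFXConnector:
        def fulltext_links(self, citation):
            ...  # talk to the SFX API, return normalized link data

    class OtherResolverConnector:
        def fulltext_links(self, citation):
            ...  # same contract, a different vendor's API underneath

    CONNECTORS = {
        "sfx": SFXConnector,
        "other": OtherResolverConnector,
    }

    def get_connector(config):
        # Switching vendors is a config change plus one new connector
        # class, assuming comparable data/capabilities, which, as
        # noted, isn't guaranteed.
        return CONNECTORS[config["resolver"]]()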

As a result, even though both SFX and Metalib have vendor-specific APIs, our use of the SFX API, in my opinion, lessens our vendor lock-in, while our use of the Metalib API increases it.

In this case, this was mostly due to factors outside our control. But it also can definitely depend on how well you've architected your client code to abstract out the API connectors. Sometimes I feel like this is heresy in code4lib, with its "just get it done" ethos, but good, well-architected code matters.


3 Responses to APIs and vendor lock-in

  1. MJ Suhonos says:

    Excellent points, Jonathan. Your point about abstracting connectors was exactly what was crucial in the design of Lemon8-XML, specifically because it allows the consuming application to easily add new external APIs, as well as drop them in the future if need be.

    As for the role of standards, I think one good way of future-proofing is to build your internal data model (or whatever you’re using) around a standard (in our case, OpenURL), and then map that in your connectors to/from the vendor APIs. That way, you’re neutralizing the lock-in somewhat by mapping data to common (open) ground.

  2. jrochkind says:

    A common ‘data dictionary’ or ‘data vocabulary’ to base the internal data model on is definitely a key point. Good point MJ.

    Although I'd question OpenURL as a good example! You realize that "OpenURL" isn't a data dictionary in the sense you mean at all, right? OpenURL can contain arbitrarily defined "metadata formats". You probably mean the "scholarly formats", but then you have to realize that really means 3 or 4 formats that are formally defined, with you abstracting the common elements out of each.

    (If you don't understand what I mean: try to find me a standard specification of OpenURL that lists the data elements you are using in your internal data model. You won't be able to. You will be able to find a list of about 20 formats here. No single one will represent your internal data model, though; it's probably a combination of book, dissertation, and journal, maybe with some dc thrown in too. And don't forget that officially each of those is defined twice, once as KEV and once as XML!)
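
    To make that concrete, here's a rough side-by-side of two of those KEV formats, sketched as Python dicts (citation values invented, fields abbreviated). The "common" elements you'd base an internal model on are something you have to abstract out yourself:

        # Z39.88 KEV "journal" and "book" matrixes, side by side.
        journal = {
            "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
            "rft.genre": "article",
            "rft.atitle": "Some article title",
            "rft.jtitle": "Some journal title",
            "rft.issn": "1234-5678",
            "rft.volume": "10",
            "rft.spage": "1",
            "rft.date": "2009",
            "rft.au": "Doe, Jane",
        }
        book = {
            "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
            "rft.genre": "book",
            "rft.btitle": "Some book title",
            "rft.isbn": "9781234567897",
            "rft.pub": "Some Publisher",
            "rft.date": "2009",
            "rft.au": "Doe, Jane",
        }
        shared = sorted(set(journal) & set(book))
        # ['rft.au', 'rft.date', 'rft.genre', 'rft_val_fmt']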

    Now, that's what I did in Umlaut too; it's not an unreasonable thing to do. But it's not exactly a well-defined data vocabulary.

    So good principle, but I’m not sure OpenURL is a very good example.

  3. MJ Suhonos says:

    Gah! I should’ve thought twice before I wrote that. Yes, of course you’re right — OpenURL is a protocol, not a metadata schema. I tend to loosely refer to the combined (scholarly) format simply as ‘OpenURL’ among the PKP group, but what I was really trying to convey as an example were the KEV formats that you so thoughtfully reference. So, thanks for catching me here.

    I agree that abstracting a set of common elements isn’t a very well-defined approach (although it works well in get-it-done practice), but then I guess that’s why they call them *application* profiles. Every vendor is likely going to have their own elements (identifiers like the ASIN are a good example), but there’s probably a standard profile that matches the data fairly closely — in the Amazon case, one of the OpenURL book KEV formats would probably map well.
