It occurred to me a while ago that Umlaut isn’t just a ‘link resolver front end’, or an ‘improved link resolver’. It is those things, but when you improve a link resolver enough, and pay attention to all forms/genres (not just journals), what you get is what I’m clunkily calling a Known Item Service Provider, an additional piece of library infrastructure.
I’ve come to think that this is in fact an essential tool that most library digital infrastructure is missing. As an infrastructural tool, it’s not necessarily designed to answer just one question for one very particular use case; it’s designed to answer the general question (for people and machine access): “What can you tell me or do for me about item X?”
Andy Powell brings up a specific question/use case that’s a sub-set of this: If I know a print book I’m interested in, and likely even know its ISBN, does the library have a licensed ebook version? And secondarily, is there an ebook version in existence whether or not the library licenses it?
This is definitely something in Umlaut’s domain. How well does Umlaut do at answering it? On the second question, ‘does an ebook exist whether or not we license it’, currently not very well. But if external sources of data (with APIs) could be identified to answer it (as Andy begins doing), plugins for Umlaut could be written to grab that data and make Umlaut’s answer better for this specific use (and perhaps improve other unexpected uses too, since you’ve improved the infrastructural tool).
The first one, does the library have an ebook version, Umlaut does better at, at least at our library.
This works because our library has endeavored to list most ebooks we have in our catalog, and Umlaut tries to do searches of the catalog. But its success depends on:
- We have a record in the catalog OR in our link resolver knowledge base for the ebook. (Umlaut tries to combine both sources of information).
- Umlaut successfully finds it, which is somewhat trickier than it sounds, since Umlaut uses some heuristic algorithms to try and balance precision (minimize false positives) with recall (minimize false negatives), as well as avoiding duplicate information when data exists in both the catalog and the link resolver.
- Sometimes the ebook record in our catalog has the print ISBN on it too, which makes Umlaut’s job easier. I’m not sure if the SFX knowledge base puts print ISBNs on ebook records.
- Sometimes Umlaut will do a title-author search of our catalog, but whether it does or not is related to complicated heuristics, which could be tuned for this use case and our data if we put some time into it.
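To make the ‘avoiding duplicate information’ part concrete, here is a minimal sketch in Python of merging links from two sources and dropping ones that point to the same place. This is not Umlaut’s actual code (Umlaut’s heuristics are more involved); the function names, the `label`/`url` record shape, and the normalization rule are all assumptions made up for illustration.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a comparison key (hypothetical rule, not Umlaut's).

    Ignores the scheme, host capitalization, and a trailing slash on the path.
    """
    parts = urlsplit(url.strip())
    return (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)


def merge_links(catalog_links, kb_links):
    """Combine links from the catalog and the link resolver knowledge base.

    Catalog links are considered first, so they win when both sources
    point at the same target. Each kept link is tagged with its source.
    """
    seen = set()
    merged = []
    for source, links in (("catalog", catalog_links), ("knowledge_base", kb_links)):
        for link in links:
            key = normalize_url(link["url"])
            if key in seen:
                continue  # same target already listed; skip the duplicate
            seen.add(key)
            merged.append({**link, "source": source})
    return merged
```

A rule this simple errs on the side of keeping links: two URLs that reach the same ebook through different hosts or proxies would still both be shown, which is exactly the kind of duplicate that is hard to catch in practice.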
But in fact, it does a reasonably good job anyway. Here are some example Umlaut URLs which take a print ISBN, and tell you “what can the library do or provide for this item”, and the result includes licensed ebooks. I’ll include a few title-author inputs as well, to show that’s feasible too.
- (Although it occurs to me that most of these ‘ebooks’ are web readable, not necessarily downloadable to a reader; not sure if that’s what Andy meant. We don’t have many downloadable ebooks in our collection; if we did, they’d show up here too, ideally identified as such, although that can be tricky).
- http://findit.library.jhu.edu/resolve?rft.title=women under socialism&rft.au=august bebel
- http://findit.library.jhu.edu/resolve?rft.isbn=9780195181142 (for this one, Umlaut fails at de-duplicating, and presents two links to the same place, although they’re identified differently).
- http://findit.library.jhu.edu/resolve?rft.isbn=9783540718352 [ditto for this one, duplicate links provided, but that’s better than none at all!]
- http://findit.library.jhu.edu/resolve?rft.title=The%20kidneys%20and%20how%20they%20work [ I don’t know if an NIH PDF counts as an ‘ebook’, but still shows the concept ].
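For machine access, URLs like the ones above are easy to generate. Here is a small Python sketch that builds them from an ISBN or a title and author, percent-encoding spaces along the way. The base URL and the `rft.isbn`, `rft.title` and `rft.au` keys are taken from the examples above; the `resolve_url` helper itself is just a name I made up for illustration.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the example links in this post.
BASE = "http://findit.library.jhu.edu/resolve"


def resolve_url(**fields):
    """Build a resolver URL from keyword fields such as isbn, title, au.

    Each field name is prefixed with 'rft.' to match the example URLs.
    quote_via=quote encodes spaces as %20 rather than '+'.
    """
    params = {f"rft.{name}": value for name, value in fields.items()}
    return BASE + "?" + urlencode(params, quote_via=quote)


print(resolve_url(isbn="9780195181142"))
print(resolve_url(title="women under socialism", au="august bebel"))
```

The second call produces the same request as the title-author example in the list, with the spaces encoded as `%20`.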
It’s definitely far from perfect. I showed you some successful positives; finding false negatives would take more time, but I’m sure they’re in there. (We generally tune Umlaut to avoid false positives, so those are less likely, but there are surely a few.)
Umlaut doesn’t use xISBN or any other “work set expander” service right now; that would be one obvious improvement I’d hope to make sometime, although ideally not before collecting some kind of evidence on how often Umlaut fails for certain tasks in ways that a “work set expander” would fix. There are other data sources and other tunings to Umlaut’s heuristics that could be tried as well.
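As a rough illustration of what a “work set expander” buys you, and of the kind of evidence I mean, here is a Python sketch. It does not call xISBN or any real service: the `work_sets` data, the toy `catalog`, and every ISBN in the usage example are made up, and the function names are hypothetical.

```python
def expand(isbn, work_sets):
    """Return every ISBN in the same work set as `isbn` (or just `isbn`)."""
    for work in work_sets:
        if isbn in work:
            return set(work)
    return {isbn}


def find_holdings(isbn, catalog, work_sets=()):
    """Look up an ISBN in the catalog, optionally via its work set.

    `catalog` maps ISBN -> record description. With no work sets given,
    this is a plain exact-ISBN match.
    """
    candidates = expand(isbn, work_sets)
    return sorted(catalog[i] for i in candidates if i in catalog)


def expansion_gain(isbns, catalog, work_sets):
    """Count how many test ISBNs match directly vs. with expansion."""
    direct = sum(1 for i in isbns if find_holdings(i, catalog))
    expanded = sum(1 for i in isbns if find_holdings(i, catalog, work_sets))
    return direct, expanded


# Made-up data: the ebook record carries only its own (fake) ISBN.
catalog = {"9780000000002": "ebook edition (licensed)"}
work_sets = [{"9780000000001", "9780000000002"}]  # print + ebook of one work
print(expansion_gain(["9780000000001", "9780000000009"], catalog, work_sets))
```

Running something like `expansion_gain` over a sample of real print ISBNs would show how many requests are only answered once expansion is turned on, which is the number I’d want before committing to the extra service.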
But I think it acquits itself pretty admirably anyway. The point is that Umlaut, as an attempted platform serving as a “Known Item Service Provider”, is a general purpose tool that can handle this specific use case among many others, and the beauty of a general purpose tool is that when you improve it for a certain use case, you get unintended benefits to other use cases you hadn’t yet considered, instead of just having a very specific tool for very specific use cases. I propose that a Known Item Service Provider like Umlaut ought in fact to be a key part of an academic library’s infrastructure.