Cited by from ISI and Scopus in the link resolver

Umlaut now supports “cited by” and “similar articles” links from ISI and Scopus. This is now live in our link resolver. You can see examples, but you won’t be able to actually follow the links provided without a local login, as they lead to licensed content. Here’s one example.

You could find more of your own by using Google Scholar and setting your preferences to include Johns Hopkins University, although be aware that you’ll only get ISI and Scopus links if the citation is tracked and can be found in ISI and Scopus respectively. Also, if our link resolver tries to deliver you straight to full text, you may have to click on “See More Options” in the upper right to see the ISI and Scopus links. [Grr, I just noticed that IngentaConnect is “helpfully” forcing OpenURL links upon our users on our behalf. I have not investigated those enough to know if they actually are helpful, or harmful, as such vendor ‘help’ with Google Scholar has been in the past.]

We also have links to Journal Citation Reports ‘impact factor’, and as these are title-level, you can see these in our catalog too, due to Umlaut service integration in the catalog.  Again, actually clicking on the link will require a local login.

Basic Idea

For both Scopus and ISI, the basic idea behind the software is:

  1. Try to look up the citation in the respective database, using some kind of web service api.
  2. If a match is found, generate ‘direct links’ into the respective database for lists of ‘cited by’ and ‘similar articles’ (and in the case of Scopus, ‘more by this author’). Ideally, we’d only display these links if there are guaranteed to be results at the end of the link. That’s possible with all links from ISI, and “cited by” from Scopus, but unfortunately not “similar” or “more by these authors” from Scopus.
  3. These lists are displayed in the native Scopus or ISI interface, so you need to be authorized to see them. If you’re off campus, you’ll go through EZProxy. Since both Scopus and ISI provide OpenURL links on these listings, these lists are ‘actionable’ for the user: they can click on the OpenURL links to return to our system to find full text from a locally licensed provider.
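As a rough illustration of steps 1 and 2, here is a minimal Python sketch. It is not Umlaut’s actual code: the `Match` and `Link` shapes, the `lookup` callable, and the example URL are all hypothetical stand-ins for whatever each vendor’s web service really returns.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Match:
    """What a lookup returns when the citation is found (hypothetical shape)."""
    cited_by_url: Optional[str] = None
    similar_url: Optional[str] = None


@dataclass
class Link:
    label: str
    url: str


def links_for(citation: dict,
              lookup: Callable[[dict], Optional[Match]],
              source: str) -> list:
    """Step 1: look the citation up. Step 2: build links only for what was found."""
    match = lookup(citation)
    if match is None:
        return []  # citation not tracked by this database: show nothing
    links = []
    if match.cited_by_url:
        links.append(Link(f"Cited by ({source})", match.cited_by_url))
    if match.similar_url:
        links.append(Link(f"Similar articles ({source})", match.similar_url))
    return links


def fake_lookup(citation: dict) -> Optional[Match]:
    """Stand-in for a real web service call; knows exactly one made-up DOI."""
    if citation.get("doi") == "10.1000/example":
        return Match(cited_by_url="https://example.org/cited-by")
    return None
```

Calling `links_for({"doi": "10.1000/example"}, fake_lookup, "ISI")` yields a single “Cited by (ISI)” link, and an unknown citation yields an empty list, so nothing is displayed for it.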

The great thing is that this is possible at all with both services. And was possible without too much trouble. Both ISI and Scopus provided me access to management-level staff who understood my use case and were interested in meeting it. And didn’t try to charge me extra for some kind of API access: I got what I needed as part of our existing institutional subscription to these services.

This is great. I hope that vendors are starting to realize that machine access is not an add-on to charge us more money for. It’s a key feature necessary for us to fully utilize the content we are already paying for, necessary for it to continue to have value for us. Both these vendors seem to be starting to realize this, to some extent.

I’d love to provide similar functionality using Google Scholar (which successfully tracks ‘cited by’ for many articles that neither ISI nor Scopus do, as well as books!), but Google Scholar does not support this, as far as I can tell quite intentionally. I’d love to provide similar links using CiteSeerX, but CiteSeerX doesn’t provide the right features to make machine access of this sort convenient (in this case, I assume due to lack of resources, not due to intentionally not wanting me to do it). So, for once, I get more functionality from the services we’re paying lots of money for than from the services available for free. And isn’t that how it should be, shouldn’t we be getting this for our money? It’s refreshing to have this actually be the case.

Thanks ISI and Scopus.

Both ISI’s and Scopus’s support for this functionality does have some problems though, including some significant ones. I’ll provide a brief technical description of how I use these services, as well as some of the technical good, bad, and ugly about each one.

ISI

This is possible using the ISI Link Article Match Retrieval Service (LAMR), which was pleasantly designed more or less exactly to support this use case. The service will take metadata about a citation, and if the citation is found in the ISI database, return links to ‘cited by’ and ‘similar’ lists. Perfect, just what I need.

The service uses an XML-RPC style of interaction, which requires POSTs of XML.  It would be more convenient if the request were just a GET with parameters in the query string, but it’s not too bad. Thank you ISI for not basing this on SOAP.

ISI Good things

Excellent support. ISI seems to have sufficient staff dedicated to supporting this feature. My questions were answered quickly. Bug reports were responded to by people who didn’t dispute they were bugs. Some bugs were even fixed quickly.

The initial API response will only deliver a ‘similar’ link to me if there are actual similar articles in ISI.  I can avoid presenting a ‘similar’ link that will lead to a ‘none found’ message.

ISI Worst thing

Extreme slowness. While the initial API response is actually reasonably speedy (although still not as fast as Scopus), the direct links to the lists delivered by the API response are horribly slow, and can take as much as 30 seconds to respond. This is only true on first request; subsequent requests for an identical URL are very speedy, so there must be some caching going on at ISI’s end. But 30 seconds is way too long. I’m deploying the function anyway because it seems useful enough, but depending on user feedback this 30 second wait may make me turn it off.

Update 26 May 2009: Thomson/Reuters/ISI has actually been extremely helpful in aiding us in diagnosing the source of this slowness problem, which seems to be mysterious. We’ve discovered that it’s definitely related to DNS, but it’s not entirely clear to me whether it’s a DNS problem on our local end, or on their end. They believe it’s a problem on our end, and I currently have tickets in to my relevant people to try and confirm or deny that hypothesis. So you won’t necessarily have this trouble yourself with the ISI LAMR service. I would be extremely interested to hear if you do or don’t, since that would be another data point in trying to diagnose the cause.

ISI other bad thing

The ISI LAMR API is very picky. If you give it something it doesn’t like, it’s likely to respond with an error, and not a well-formatted error, but the kind of error that reveals an uncaught exception. To be specific, a weird <null value="error"> element in the returned XML, with no code or message allowing you to determine the nature of the error.
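Since the error carries no code or message, about all a client can do is notice it. Here is a minimal Python sketch of that check, not Umlaut’s actual code. Only the `null` element with `value="error"` comes from what I observed; the surrounding element names in any example response are made up, and treating an unparseable response as an error is my own assumption.

```python
import xml.etree.ElementTree as ET


def has_error_marker(xml_text: str) -> bool:
    """True if the response contains a <null value="error"> element anywhere."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return True  # assumption: a response that is not well-formed XML is also an error
    return any(
        # Strip an optional "{namespace}" prefix before comparing the tag name.
        el.tag.rsplit("}", 1)[-1] == "null" and el.get("value") == "error"
        for el in root.iter()
    )
```

A caller would then skip link generation entirely for that citation rather than try to interpret the rest of the response.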

If you send it a bad DOI, you get one of these. And my software has bad DOIs aplenty in the citations given to it. Google Scholar is a particular offender here, sending bad DOIs frequently to Umlaut, which Umlaut then tries to request from ISI.  Even if there are other data elements in the citation that would be sufficient to identify an article, a bad DOI in the request gets you an error.

And which data elements are sufficient to identify an article is somewhat mysterious. An ISSN without a journal title is never sufficient; that’ll get you either an error or a false negative. You need to send the journal title to get hits. Why? Who knows. Umlaut usually, but not always, has the journal title to send. Vol/issue/page# are not sufficient to get hits, even if they uniquely identify an article; you need to send the article title too. Umlaut usually but not always has an article title to send. Umlaut often doesn’t have a proper author, and fortunately the ISI API seems to tolerate this. But all of these (in)tolerances are undocumented. It’s annoying.
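One way to cope with this is to clean up the citation before sending it. The following is a minimal Python sketch of that idea, not Umlaut’s actual code. The field names (`doi`, `journal_title`, `article_title`) are hypothetical, the DOI pattern is only a loose shape check that cannot catch a well-formed but wrong DOI, and letting a well-formed DOI stand on its own is an assumption, not documented ISI behavior.

```python
import re
from typing import Optional

# Loose DOI shape check: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def prepare_for_isi(citation: dict) -> Optional[dict]:
    """Return a cleaned copy of the citation worth sending, or None to skip the lookup."""
    # Keep only non-empty string fields, with surrounding whitespace removed.
    cleaned = {k: v.strip() for k, v in citation.items()
               if isinstance(v, str) and v.strip()}
    doi = cleaned.get("doi")
    if doi is not None and not DOI_SHAPE.match(doi):
        # A bad DOI gets an error even when the other fields would have matched.
        del cleaned["doi"]
    if "doi" in cleaned:
        return cleaned  # assumption: a well-formed DOI is enough by itself
    # Without a DOI, both titles appear to be required to get hits.
    if "journal_title" not in cleaned or "article_title" not in cleaned:
        return None
    return cleaned
```

Dropping a malformed DOI instead of abandoning the lookup keeps the other metadata in play, which matters given how often bad DOIs arrive from Google Scholar.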

ISI more annoyingness

The ISI LAMR API authenticates by server IP address. So if your server IP address changes, you need to file a support request with ISI, and then wait possibly a few days for them to update the access. Annoying.

Scopus

Scopus doesn’t really provide an API meant for this use case. But it does provide a collection of web services that can be pieced together to do this. I guess that’s testament to the fact that if you build some good web services, they can be used in ways you never anticipated. But it leads to some annoyingness and lack of functionality.

First, Umlaut has to use a Scopus API meant for browser-side JavaScript access. You’ve got to register for that and get an API key. While it’s meant for browser-side access, they don’t seem to try and stop me from making a request and getting back JSON server-side. Thanks to Alf at hublog for figuring that out.

Then, once you’ve identified an item in the Scopus database, you need to use Scopus ‘Direct Linking’ patterns to actually send the user to the list. Which theoretically requires you to register and get a different key. And you can’t register on the web, you need to, according to the documentation I got only after registering, “contact your nearest regional Scopus office.” SO inconvenient. Or would be, if you actually needed to do it, but it seems like you can get away with using some values associated with the Scopus JSON Search API you already registered for instead.

Scopus Good things

Both the search API request and the subsequent direct links are reasonably fast. The API response is faster than ISI’s, and the direct links generally resolve in a few seconds instead of ISI’s 25-30.

The API also mostly Just Works. I haven’t seen any errors from it. It successfully finds articles that are actually tracked by Scopus, even when all you give it is an ISSN/volume/issue/page#. Great.

Scopus Worst Thing

There is a truly terrible error with cookies I don’t entirely understand. If you leave your browser open for more than a few hours, and used Scopus (any part of Scopus) in that browser session more than a few hours ago (not sure exactly how many are a ‘few’, but I think more than 2 and less than 24), then next time you access Scopus (for instance, by trying to follow one of the links that Umlaut gives you), you’ll get a ‘bad session key’ error from Scopus. And Scopus will then redirect you to the Scopus home page. Not to the Direct Link that Umlaut tried to send you to. Next time you click on a link, it’ll work, because your session cookies have been fixed, until you wait a few more hours with an open browser.

I’m not even positive that it won’t appear if you close your browser, since I always leave mine open forever.

This is pretty disastrous. I’m going to see how often users actually encounter it.  I hope it doesn’t require me to disable this functionality.

Scopus Other Bad Thing

The original JSON Search API response from Scopus gives me a number of times cited. Great, not only can I use this to label the link, but I can use it to make sure not to provide a link if the count is 0. This is good. (Incidentally, the response also gives me an abstract, which is good, and which I do include on the Umlaut page where present.)

But, while Scopus gives me a way to generate “Direct Links” to “similar articles” (two different kinds of calculated similarity, actually), and “more by the same authors” — it gives me no good way to pre-check to make sure these lists will be non-zero before sending a user to them. And indeed some of these lists for some articles do end up telling the user “sorry, none found.” After the user clicks on them.

This violates one of my general principles for Umlaut, never to put a link on the page unless I know it’s going to take the user to something useful.  I think if users are confronted with links that give them “sorry, can’t show you anything” too often, they’ll stop clicking on any of the links ever.  Nevertheless, I’m including them because these 0-hit results seem rare enough and the functionality useful enough.
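For the ‘cited by’ link, where a count is available, the principle is easy to apply. Here is a minimal Python sketch, not Umlaut’s actual code; the function name and the label wording are hypothetical.

```python
from typing import Optional


def cited_by_label(count: Optional[int]) -> Optional[str]:
    """Label for a 'cited by' link, or None when the link should be hidden."""
    if not count or count < 1:
        return None  # no count, or zero citations: don't show a dead-end link
    noun = "article" if count == 1 else "articles"
    return f"Cited by {count} {noun} in Scopus"
```

The ‘similar articles’ and ‘more by the same authors’ links can’t go through a check like this, because there is no count to check them against.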

Conclusion

That both ISI and Scopus make this possible at all is a great thing, despite fairly significant flaws with both services that will impact usability. I think including these services in Umlaut will expose useful Scopus and ISI functionality (that we’re already paying for) to users who were unaware of it, or found it too inconvenient to use regularly before. I think these features will increase the value of both ISI and Scopus to us.

I still wish both services would fix their significant flaws.

It’s great that both services give me a ‘cited by’ count, so I can show the user how many citations they’ll see before they click on the link to see them. It would be nice if both services could do this for ‘similar articles’ too (although at least ISI will implicitly tell me if it’s zero or non-zero). A count might not be too useful there, though, because both services provide a Google-esque list of thousands or even millions of hits, fortunately ranked so the first page is the most useful.

Both services in their native interface provide a list of references from a given article (as compared to ‘cited by’), but I’m not currently providing links to these. That would be a useful service, since both services have OpenURLs on those lists. To support that, it would be nice if both services would give me a direct link to those references, and a way to check in advance to make sure there are non-zero references (or even give me a count).

6 thoughts on “Cited by from ISI and Scopus in the link resolver”

  1. Dear Sir,

    I need you to clarify Thompson Router ISI and Scopus.

    1. Recently in the Malaysia Institution of Higher Learning system. almost all of them need their academics to publish research paper in high impact journals which index in Thompson ISI or Scopus. Question is which is better?

    2. There is this institution of higher learning in particular that set its own regulation whereby any academician wanted to present research paper in any international conference need to have proof that the conference must index in Scopus. I think is is ridiculous.

    3. I have gone and present a research paper recently and the conference proceeding has ISSN number. Does this mean this international conference has quality? and is index with ISI?

    4. If this institution keeps on barking on Scopus conference and Scopus papers needed to enable academic staff to present oversea, then it is sadden and academic assasisnation by the administrators.

    Please Advice.

  2. John Lee: My opinions only:

    1. Neither, really. Depends on what you want to use it for I guess. They are similar. Neither one is a good idea for ‘proving’ the quality of research, this is a terrible idea, I agree. There is some writing on this you might be able to use to back up your feelings that this is a terrible idea. Here is one example: http://www.istl.org/10-winter/refereed2.html

    And here is a blog post that includes links to more research and opinion: http://scienceblogs.com/bookoftrogool/2010/02/librarians_down_with_the_impac.php

    2. I agree this is completely ridiculous, essentially “outsourcing” the judgement of quality to Scopus or ISI. Very very ridiculous.

    3. No, an ISSN doesn’t mean anything except the publication registered and got an ISSN. There is no judgement of quality made by the people who assign ISSNs; anyone can get one. And you’d have to ask ISI about their collection policies, but I’m pretty sure they don’t index every single thing that has an ISSN. Not everything with an ISSN is even available electronically, and ISI isn’t scanning it all.

    4. I think I agree.

  3. Hi, thanks for the post! I’m very new to ISI Web of Science APIs so I was just wondering if you knew of a way in which I can search an author by name and get back XML results of publications that the author is in.

    Any advice would be tremendously helpful!
