So I had been operating under the incorrect assumption that OAIster only aggregated feeds which claimed to be of open access materials.
After embarrassingly sending them a letter (and cc’ing code4lib) asking for clarification, I noticed their collection development policy page. (Embarrassing because I should have checked first.)
- We harvest and retain all records that point to digital resources.
- This includes freely-available and restricted-access digital resources.
While apparently this has always been their policy, until recently the vast majority of what they held seemed to be open access, such that many of us didn’t notice the restricted for-pay stuff in there. In the past 6 months to a year they seem to have added a bunch of feeds with large amounts of restricted for-pay stuff, such that it’s no longer uncommon to run into it.
It still took me until just now to realize what was going on, and that I couldn’t in fact use OAIster as a reliable search engine of public access content. I bet many of you reading this haven’t realized it yet either, which is why I point this out.
But this is highly unfortunate. I thought I could use OAIster as a search engine covering a large swath of the public-access scholarly web. I really could use a source I can search with a known title and author to see, in a reliable way, if the article is available public access somewhere (yeah, you could just search google, but I’m using this in software that wants to make a decision about what to do with it based on it being public access! I don’t want to present my users with links they can’t access!). I thought that was OAIster, but it’s not.
So is there anything else that will do that for me? I don’t think so. It’s a gaping hole. We need someone to create an OAI-PMH aggregator that, unlike OAIster, will only take feeds of public access content.
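To make the wish concrete, here is a rough Python sketch of the kind of thing I mean. It assumes feeds return standard OAI-PMH `ListRecords` responses in `oai_dc`, and it assumes somebody has already decided, per feed, whether the provider declares it open access. The `declared_open_access` flag, the `PublicAccessIndex` class, and the sample record are all made up for illustration; nothing here is OAIster’s or anyone else’s actual code.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample response; a real one would come from a feed's ListRecords request.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>http://example.org/article/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


@dataclass
class Record:
    title: str
    creators: list
    url: str


def parse_list_records(xml_text):
    """Pull title, creators, and the first dc:identifier out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # e.g. deleted records carry no metadata
            continue
        title = dc.findtext("dc:title", default="", namespaces=NS)
        creators = [c.text or "" for c in dc.findall("dc:creator", NS)]
        url = dc.findtext("dc:identifier", default="", namespaces=NS)
        records.append(Record(title, creators, url))
    return records


class PublicAccessIndex:
    """Toy aggregator that only admits feeds declared to be open access."""

    def __init__(self):
        self._records = []

    def add_feed(self, xml_text, declared_open_access):
        # Assumed policy: skip the whole feed unless it is declared open access.
        if not declared_open_access:
            return 0
        recs = parse_list_records(xml_text)
        self._records.extend(recs)
        return len(recs)

    def find(self, title, author):
        """Known-item lookup: exact (case-insensitive) title, author as substring."""
        t = title.casefold().strip()
        a = author.casefold().strip()
        return [
            r for r in self._records
            if r.title.casefold().strip() == t
            and any(a in c.casefold() for c in r.creators)
        ]
```

With something like this, the calling software could show a link only when `find(title, author)` returns a hit, since everything in the index came from a feed that claimed to be public access. The hard part, of course, is the per-feed decision, which this sketch just takes as given.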
The response to my letter sent without appropriate research:
Hi Jonathan,
We have always included more than open access repositories. You can find our collection development policy at: http://oaister.org/restricted.html We developed that policy about a year ago because of questions just like yours.
We fully understand the need for an aggregator to only OA materials. However, we are currently in the midst of infrastructure and hosting changes for OAIster, and are not able to undertake that at the moment. We would hope we can achieve this at some point.
In the meantime, have you checked out OpenDOAR? They have a service akin to what you're looking for, and they do provide full-text searching. http://opendoar.org/
Please let me know if you have further questions.
Regards,
-Kat [Hagedorn]
To which I say
Glad they are considering it. And I thank them for the pointer to OpenDOAR; looks like that’s what I _really_ needed when I was using OAIster instead. OpenDOAR claims to offer an XML API (wish it were just SRU instead). Guess I’m off to investigate it and try to implement an OpenDOAR query in Umlaut.
Their API is just of repositories, not of article-level metadata. They have a custom google search of article-level metadata, but we all know google has no APIs anymore. Drat drat drat. So I’m back to having NO available option for searching a large swath of the public access scholarly internet via API. This is a big problem.
Hmm, I am feeling awfully stymied.