Hmm, it kind of looks like Google Scholar may have accomplished something I’ve been trying to figure out how to do for quite a while — linking a published citation to an open access pre-print found somewhere.
Am I understanding the results properly?
Check out this result list for a topic that came into my head and that I decided to search Scholar for.
It looks to me that if you click on the title, you get taken to a publisher paywall. But in the right-hand column next to many of the hits (5 of 10 on this page) is a link which seems to be an open access pre-print. No? And they actually manage to link right to the PDF too, not to an annoying DSpace/Fedora/Whatever landing page that makes it really confusing to find the additional click to the actual PDF.
Anyone have any clues as to what’s going on, or how they’re doing it?
Man, I really wish Google Scholar had an API. I’d really like my link resolver (Umlaut) to be able to alert someone to an open access pre-print for the citation they’ve found. But I haven’t been able to find a reliable way to do it. (It is deeply sad to me that searching, for instance, DOAR isn’t good enough, because repos listed in DOAR actually do have all sorts of embargoed and otherwise restricted content in them too, despite their advertised collection policy, and there’s no way to tell which is which. They also don’t provide a search service, other than a Google custom search, but that could maybe be worked around if one could be confident hits in their collection really were OA. OAIster also includes non-OA content, although OAIster doesn’t pretend otherwise.)
Really, this is a failing in common metadata: most OAI-PMH harvests will include embargoed and otherwise restricted content along with OA content, with absolutely no metadata advertisement of which is which. And then there’s the fact that most OAI-PMH harvests will advertise links to the aforementioned annoying landing page, not to the actual assets. Again, no common metadata schema is being used to advertise what a link will actually lead to. Really, I’m deeply disappointed that this kind of thing (good metadata that will allow software to know if an item really is OA, and to get a link directly to the content as well as the landing page) doesn’t seem to be a concern of the repository and communities. This has been a problem for YEARS, and if any of the various organizations involved in this stuff are even making any efforts to address it, I haven’t heard about it.
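To make the complaint concrete, here is a rough sketch of what software sees when it reads a typical `oai_dc` record from a harvest. The record below is made up for illustration (the `repository.example.edu` URL, the title, and the rights text are all hypothetical), and `describe_record` is just a name I picked. The only real pieces are the standard `oai_dc` and Dublin Core namespaces.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for unqualified Dublin Core as carried over OAI-PMH.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical record, invented for this example; not from a real repository.
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Article</dc:title>
  <dc:identifier>http://repository.example.edu/handle/1234/5678</dc:identifier>
  <dc:rights>See publisher site for terms</dc:rights>
</oai_dc:dc>
"""


def describe_record(xml_text):
    """Pull out the only hints a harvester gets: URLs and free-text rights."""
    root = ET.fromstring(xml_text)
    urls = [
        el.text.strip()
        for el in root.findall("dc:identifier", NS)
        if el.text and el.text.strip().startswith(("http://", "https://"))
    ]
    rights = [
        el.text.strip() for el in root.findall("dc:rights", NS) if el.text
    ]
    # Crude guess only: a URL ending in .pdf *might* be the actual file.
    # Nothing in the schema says so, and nothing says whether it is OA.
    direct_pdf_guesses = [u for u in urls if u.lower().endswith(".pdf")]
    return {
        "urls": urls,
        "rights": rights,
        "direct_pdf_guesses": direct_pdf_guesses,
    }


info = describe_record(SAMPLE_RECORD)
print(info)
```

In this made-up case the code ends up with one URL that is presumably a landing page, a free-text rights statement it can’t interpret, and no idea whether the full text is embargoed. Guessing from the URL suffix is the best it can do, which is exactly the problem.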
The other interesting thing that occurs to me, as I play around with this more, is that many of the PDF links G. Scholar finds are in fact NOT pre-prints. They appear to be the actual page images of the final published version, often hosted in the personal web areas of one of the authors (guessing from the URLs, which include a tilde and lastname, or the name of a lab or research group). Wonder how many of these the author actually has the publisher’s permission to post, and how many not? Dorothea, you reading this? What do you think?