So Google has announced a much-awaited api for pre-checking availability of full text in Google Books. Here is one post with more detail than other announcements I’ve found.
However, there’s no technical reason why you couldn’t do this server-side as well. It’s just an HTTP GET request with certain query parameters which returns JSON. I can certainly parse JSON server-side.
So there’s no technical reason I can’t do it server-side. But Google may certainly stop me with policy. They could rate-limit requests to the API from any given IP (and it sounds like they DO: “Because developers often issue an atypical quantity of requests, you may accidentially tip the security precautions found in Google Book Search.” ). Google certainly has it’s own business reasons to want to aggregate as much individual data as possible, not let my application be an intermediate proxy. (Google is in fact in the business of collecting usage data, not of providing search. Think about how they make their money). So hmm, time will tell.
Eagerly awaiting more information about this. Not quite sure how to get it.
updates (14 Mar):
Got a reply from Google:
You can do something similar to this on the client side, just add some logic
on the viewability information.
Unfortunately, we don’t support server side querying of the API, because
viewability is based on local rights limitations (different countries
consider different books to be public domain), and we think it hurts theuser experience to provide incorrect viewability information.
Doh! This doesn’t really answer my concern I’m afraid, I really can’t do what I need to do client side, at least not without extreme difficulty or loss of functionality. But I guess that’s how it’ll be!
Had the idea of asking the ksu SFX example for an XML response, to see what that tells us. Of course everything in the XML response is neccesarily generated server side. The Google Books section looks like this:
<target_public_name>Google Book Search</target_public_name>
Hmm. It’s hard to say. The XML response does not include what’s in the HTML response telling you for sure what kind of access is available. But it does include a google books URL that sure looks like it required talking to Google on the server-side to generate–it doesnt’ have an ISSN in it, it has a Google unique ID. How would SFX know that Google unique ID without talking to Google on the server-side? Which Google tells me they don’t allow. Hmm. Curioser.