In the current environment, it’s an embarrassing shame that not all of our catalog interfaces allow direct links to particular search results in the catalog. Links like that are necessary if we want to integrate ourselves into the workflows and communities of our patrons.
Our Blacklight-based catalog (not yet fully in production, but in a public beta) naturally does allow such things, which lets our research librarians send tweets like this one, linking to these results. Note how after following that link, you the viewer can, in addition to just looking at what’s there, also easily modify the search, adding or removing limits or changing the search query.
Nice job to whatever anonymous reference librarian here sent that tweet!
Also, for better or for worse, Google is crawling our Blacklight-based public beta, and ends up indexing many of these internal ‘deep links’ to search results too. For instance, at least on my computer at the time I write this (the Google index seems to be different for different viewers these days, and certainly always changes over time), a Google search for “Gil Scott Heron thesis” has as its third result this deep link into our Blacklight-based catalog results.
I guess it’s kind of neat that we’re in Google, although the utility of this particular link is questionable. Firstly, for that particular query, the first bib on those search results is probably more relevant than the search results themselves. Secondly, if you’re a random Google searcher who is not at our institution and you’re looking for Gil Scott-Heron’s thesis (MFA in creative writing, I guess?), I’m not sure how useful it is to find out that we’ve got it here in print at our library (original manuscript even), but without any links to digitized copies viewable online (I’m honestly not sure if we have it publicly available in our IR or not; I think not yet), or any other way for you to get at it.
(That a search like this on Google will point to our catalog was discovered by colleague Sean from looking at our Google Analytics for the catalog app. Thanks, Sean.)
What’s your rationale for allowing bots to index searches (as opposed to restricting everything but direct-to-the-record links and creating a sitemap)?
None, Bill, it wasn’t intentional, just sort of naive default behavior, haven’t gotten around to wiring up sitemap creation. Creating a sitemap for a 4 million document Solr index seems potentially tricky. Are you creating a sitemap from your Solr index, and if so, wanna share details?
I’m thinking if you run the sitemap generation on your live Solr index, you’re going to put a serious load on it (Solr isn’t so great at paging to the end of a large result set, in this case one as big as the entire index), as well as mess up your document cache by pushing all the real stuff off the end of it. So possibly I’d run it on my ‘master’ server, the one I index to but don’t serve production queries from, but I’m still worried it could be an awfully slow process, perhaps interfering with ongoing indexing operations. But maybe my fears are misplaced and it’ll just work even with the naive implementation; I haven’t had time to mess with it.
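For what it’s worth, here is a rough sketch of one way the deep-paging worry might be sidestepped. It is not something we run. It assumes a Solr new enough to support cursorMark paging (4.7 or later), which walks the index in sort order rather than by ever-larger start offsets. It also assumes the uniqueKey field is called id and that record pages live under a /catalog/ URL prefix. The function names are made up for illustration. It does nothing about the document cache concern, so pointing it at a non-production server probably still makes sense.

```python
import json
from typing import Callable, Iterable, Iterator, List
from urllib.parse import quote, urlencode
from urllib.request import urlopen
from xml.sax.saxutils import escape

# The sitemap protocol allows at most 50,000 URLs per sitemap file.
MAX_URLS_PER_SITEMAP = 50000


def solr_fetch_page(select_url: str, rows: int = 1000) -> Callable[[str], dict]:
    """Build a function that fetches one page of ids for a given cursor.

    Assumes the uniqueKey field is named 'id'; cursorMark requires the
    sort to include the uniqueKey field.
    """
    def fetch(cursor: str) -> dict:
        params = urlencode({
            "q": "*:*", "fl": "id", "sort": "id asc",
            "rows": rows, "cursorMark": cursor, "wt": "json",
        })
        with urlopen(f"{select_url}?{params}") as resp:
            return json.load(resp)
    return fetch


def iter_doc_ids(fetch_page: Callable[[str], dict]) -> Iterator[str]:
    """Yield every document id, following nextCursorMark until it stops changing."""
    cursor = "*"
    while True:
        page = fetch_page(cursor)
        for doc in page["response"]["docs"]:
            yield doc["id"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # Solr signals the end by repeating the cursor
            return
        cursor = next_cursor


def render_sitemap(ids: List[str], record_url_prefix: str) -> str:
    """Render one sitemap file linking directly to record pages."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for doc_id in ids:
        url = record_url_prefix + quote(doc_id, safe="")
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


def sitemap_files(ids: Iterable[str], record_url_prefix: str,
                  per_file: int = MAX_URLS_PER_SITEMAP) -> Iterator[str]:
    """Group ids into batches of at most per_file and yield one sitemap per batch."""
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == per_file:
            yield render_sitemap(batch, record_url_prefix)
            batch = []
    if batch:
        yield render_sitemap(batch, record_url_prefix)
```

Usage would look something like sitemap_files(iter_doc_ids(solr_fetch_page(select_url)), prefix), writing each yielded string to its own file. A 4 million document index would come out as 80 files at the 50,000-URL limit, which would then be listed in a sitemap index file.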
That was Jen Darragh on Twitter duty last week; she was responsible for tweeting catalog results. And we’re not just tweeting them: we’re pasting them into e-mails, starting to put them into blog posts, etc. Great function of Catalyst!