Search hints/related search?

So Google and Yahoo both sometimes offer “related” searches, in a nice AJAXy popup.

I don’t have time to find an example to show you, but I think most of you have seen it with Google at least: the Firefox Google OpenSearch toolbar, for instance. I put in “library” and in a popup it suggests “library of congress; librarything; library thing; library journal” etc. Maybe that wasn’t the best example, but sometimes this is useful.

It strikes me that it would be really nice to have a similar feature in our various library search functions (including catalog and federated search?). First thought is, gee, can I just use the Yahoo and/or Google APIs to do this? But I seriously doubt that would be consistent with either of their Terms of Service: we’d be using the service for something that has nothing to do with Google/Yahoo and isn’t going to lead to a search of Google/Yahoo, and instead using the suggestions for searches of our own content.

So, that gets me thinking, how do you do this? Obviously Google and Yahoo are coming up with these suggestions by analyzing their own data: either their corpus of indexed stuff, their query logs, or likely a combination of both. Anyone know if there are any public basic algorithms for doing this kind of thing? Anyone have enough “information retrieval” knowledge to hazard a guess as to what sorts of algorithms are used for this? How would we go about adding this to our own apps?

Update: It also occurs to me that this would be ANOTHER natural service for OCLC to provide. To provide “related search” suggestions well, you need a good corpus and some data mining. OCLC has a giant corpus of not only book metadata, but search query history from their database offerings. An OCLC “search suggestion” API where you give it a query, and it gives you search suggestions, which you are licensed to use in any search your library has? I’d recommend my library pay for that, if the price was right. Natural service from OCLC.

This entry was posted in Practice, programming.

5 Responses to Search hints/related search?

  1. Jon Gorman says:

    I believe from what I have heard and seen that they probably rely pretty heavily on their search logs and matching terms to previous searches. (Similar to Google Suggest, I believe.)

    You probably could start doing something similar by looking for common terms/search combinations in previous searches. Since at least for our OPACs and digital libraries we have a smaller set of terms, you could also roll that into a related search suggestion.

    And if you really wanted to get fancy, you could compare the terms in the search to some generated clusters and see if any of them fit. That would take quite a bit of tweaking though.
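
A rough sketch of the query-log idea above, in Python. Everything here is an assumption made for illustration: the function names, the sample log, and the scoring (count how often a logged query shares a term with what the user typed) are hypothetical, and this is not how Google or Yahoo actually do it. It also assumes you can export past searches from your OPAC as plain strings.

```python
from collections import Counter, defaultdict


def build_suggestion_index(query_log):
    """Map each term to a Counter of the logged queries that contain it."""
    index = defaultdict(Counter)
    for raw in query_log:
        # Normalize case and whitespace so near-duplicate queries are merged.
        query = " ".join(raw.lower().split())
        if not query:
            continue
        for term in set(query.split()):
            index[term][query] += 1
    return index


def suggest(index, user_query, limit=5):
    """Return up to `limit` past queries sharing a term with `user_query`,
    most frequently logged first (ties broken alphabetically)."""
    normalized = " ".join(user_query.lower().split())
    scores = Counter()
    for term in set(normalized.split()):
        scores.update(index.get(term, {}))
    # Do not suggest the query the user has already typed.
    scores.pop(normalized, None)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:limit]]


# Made-up log entries, purely for illustration.
SAMPLE_LOG = [
    "library of congress",
    "library of congress",
    "library journal",
    "Library  Journal",
    "library of congress",
    "librarything",
    "mars exploration",
]

if __name__ == "__main__":
    index = build_suggestion_index(SAMPLE_LOG)
    print(suggest(index, "library"))
```

Because the index is built only from your own log, the suggestions reflect what your own users search for. Note that whole-word matching means "library" does not pull in "librarything"; prefix matching would be a separate design choice.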

  2. David Bgwood says:

    It might make more sense and be more useful to the users if it were based on local searches, not a global set like OCLC. In my institution Mars always means the planet. In a business school it most likely refers to the candy company. Generic searches from OCLC would be just noise to most special libraries. For publics and schools they would not reflect local interests.

  3. Peter Murray says:

    You might want to check out Dave Pattern’s decorative tag cloud. It uses keyword harvested information to generate a tag cloud suggesting refinements to searches.

  4. jrochkind says:

    Thanks David. And yet, somehow Google suggested searches work out decently (at least they seem to, to me) despite not reflecting local interests.

    Have to check out Dave Pattern’s stuff, thanks Peter.

  5. Jenn Riley says:

    When a term in the search matches an authorized or lead-in term from a relevant controlled vocabulary, all terms with an RT (related term) relationship to that term could be displayed as suggested searches. You could also do this with BTs (broader terms) and NTs (narrower terms), if you’re not automatically expanding them. This is definitely something OCLC’s evolving terminology services could support.

    I’m not saying this is the *only* place we could look for term suggestions, but it’s an easy one as we already have the data about the relationships.
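
The controlled-vocabulary approach might look something like the sketch below. The `Term` structure, the function names, and the sample entry are all hypothetical; a real thesaurus would be loaded from whatever format your vocabulary ships in. For simplicity it assumes the whole query, rather than a single word inside it, is matched against authorized and lead-in terms.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One authorized heading with its thesaurus relationships."""
    label: str
    related: list = field(default_factory=list)    # RT
    broader: list = field(default_factory=list)    # BT
    narrower: list = field(default_factory=list)   # NT
    lead_ins: list = field(default_factory=list)   # non-preferred entry terms


def build_lookup(terms):
    """Index every authorized and lead-in label, case-insensitively."""
    lookup = {}
    for term in terms:
        for label in [term.label, *term.lead_ins]:
            lookup[label.lower()] = term
    return lookup


def suggest_from_vocabulary(lookup, query, include_hierarchy=False):
    """Return RT labels for a matching term, optionally followed by BTs and NTs."""
    term = lookup.get(query.strip().lower())
    if term is None:
        return []
    suggestions = list(term.related)
    if include_hierarchy:
        suggestions += term.broader + term.narrower
    return suggestions


# Illustrative entry only; not copied from any real vocabulary.
SAMPLE_TERMS = [
    Term(
        label="Motion pictures",
        related=["Television programs"],
        broader=["Performing arts"],
        narrower=["Documentary films"],
        lead_ins=["Movies", "Films"],
    ),
]

if __name__ == "__main__":
    lookup = build_lookup(SAMPLE_TERMS)
    print(suggest_from_vocabulary(lookup, "movies", include_hierarchy=True))
```

Leave `include_hierarchy` off if the search already expands broader and narrower terms automatically, so the same headings are not offered twice.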
