federated search user interface

Here’s a field report on some negative aspects of one particular federated search user experience. I’m not certain who wrote it, sorry.

I think some of those specific issues cited have actually been dealt with pretty well in David Walker’s Xerxes design. For instance, it gets the user to full text much more reliably (although still not perfectly) than most broadcast search interfaces.

I think Xerxes does a lot better than any commercial out of the box broadcast search software I’ve seen. But there are limits to broadcast search. There’s a point at which each incremental improvement takes more and more development effort. There are a few more tweaks that can be made at the technology level, which vendors have worked on, are working on, or could work on: faster response time and better machine analysis (relevancy ranking, de-duping, clustering).
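To make "de-duping" concrete, here is a minimal sketch of how merged broadcast results might be collapsed when the same article comes back from more than one source. It assumes records carry only a title and a year; the `Record` class, `dedup_key`, and `merge_results` are hypothetical names for illustration, not part of Xerxes, Metalib, or any vendor product, and real systems would match on richer metadata.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical minimal citation record returned by one remote source."""
    title: str
    year: int
    sources: list[str] = field(default_factory=list)


def dedup_key(title: str, year: int) -> tuple[str, int]:
    """Normalize case, punctuation, and spacing so near-identical titles match."""
    words = re.sub(r"[^a-z0-9]+", " ", title.lower()).split()
    return (" ".join(words), year)


def merge_results(results_by_source: dict[str, list[Record]]) -> list[Record]:
    """Collapse duplicates across sources, remembering where each record was found."""
    merged: dict[tuple[str, int], Record] = {}
    for source, records in results_by_source.items():
        for rec in records:
            key = dedup_key(rec.title, rec.year)
            if key in merged:
                # Same article seen again: keep one copy, note the extra source.
                merged[key].sources.append(source)
            else:
                merged[key] = Record(rec.title, rec.year, [source])
    return list(merged.values())
```

Even this simple version hints at why the analysis is hard in a broadcast setting: it can only run after every source has answered, and it depends on each source returning titles and dates in a comparable form.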

I’ve suggested before that you can get a much better interface out of a local index solution than a broadcast search solution, if you can manage to get a robust local index of all your licensed scholarly content. That is not easy, but if you really want a good cross-search interface, it is easier than trying to make broadcast search do magic.

I think the Serials Solutions Summon service is awfully interesting: it provides a centralized, licensable index of cross-vendor article-level metadata. That is, it lets the user search a large database of scholarly material (like Google Scholar) that is (unlike Google Scholar) balanced across disciplines, predictable in content, normalized and structured (useful for better link resolver integration and much more), and available to customers through a full-featured API.

If they can succeed in providing access to nearly as much and diverse content as broadcast searching, can provide a reliably robust and quick-running service, and can price it in the general order of magnitude of existing broadcast search services — I think this could be the future of scholarly meta-search.

6 thoughts on “federated search user interface”

  1. Good article!

    I have two questions.

    First, how does David Walker’s Xerxes design address the common federated search problems that are often created at the remote source (i.e. not the fault of the federated search software)? I ask because the article you cited mentioned specific source problems (i.e. changes in the source that cause connectors to break), lack of full text, and long wait times for some results.

    Second, what do you think of the interface and capabilities of technology that is displayed at http://www.mednar.com, http://www.biznar.com and http://www.scitopic.org?

    Thanks. Larry.

  2. Hi Larry.

    I am not familiar with the sites you mention or the technology behind them. I see that you work for the company supplying that technology. I would appreciate it in the future if you (and other commenters) would disclose with your comments any business interest in products you are plugging in my comments. But since you are with this company, I’d turn the question around to you: How do YOU think that technology deals with the problems identified in the critique? You’re more familiar with it than I.

    As far as Xerxes…

    Xerxes can’t address the inherent problems with broadcast search, including problems created at the remote source.

    I think it does close to as well as can be done within those parameters. I was particularly thinking of the lack of full text links. Metalib, the commercial system that Xerxes was built upon, DOES sometimes provide links to full text when provided by the remote source. Other times it provides OpenURL links to an institutional link resolver, which should be able to find the full text if it’s present. But Metalib’s native interface seems to almost perversely hide these links from the user.

    Xerxes does a somewhat better job of extracting full text links from the records returned by the remote source than Metalib alone did. It also attempts to label the links as PDF or HTML where possible, and it does a somewhat better job of constructing a good OpenURL than Metalib did. It then does a better job of putting the _appropriate_ link (direct to full text if available, otherwise link resolver) front and center where the user can’t miss it, both directly on the search results screen and on the item detail page. This is, after all, what the user is probably most interested in!

    Xerxes also does the best it can to let users refine their search on the same page where the results are displayed (another specific problem noted by that article), using facets/clusters provided by the underlying Metalib system, as well as taking the simple step of putting a search box on the results page, pre-filled with the current search and awaiting a refined search.

    These are all just improvements within the broadcast search paradigm’s limitations. Some are trickier than others, although in general I’m amazed that most commercial broadcast search solutions don’t do a better job than they do.

    But Xerxes can’t do anything about long search times, or connectors breaking when the source changes. These problems are inherent to the broadcast search paradigm, and nothing can really be done about them, at least not without more coordinated and conscientious activity by the various remote sources. That is why I’ve thought for a while that the approach taken by Summon is the future, although it is also not a cheap or easy approach.
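The link selection described in the comment above (direct full text when the source supplies it, otherwise an OpenURL to the link resolver) can be sketched roughly as follows. This is not Xerxes code: the record layout (`fulltext_links`, `citation`), the function names, and the resolver address are assumptions for illustration. The `url_ver`, `rft_val_fmt`, and `rft.*` keys follow the OpenURL Z39.88-2004 key/encoded-value convention for journal citations.

```python
from __future__ import annotations

from urllib.parse import urlencode


def build_openurl(resolver_base: str, citation: dict[str, str]) -> str:
    """Build a key/encoded-value OpenURL for a journal article citation."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    for name, value in citation.items():
        if value:  # skip empty fields rather than sending blank keys
            params[f"rft.{name}"] = value
    return f"{resolver_base}?{urlencode(params)}"


def best_link(record: dict, resolver_base: str) -> tuple[str, str]:
    """Return (label, url): direct full text if present, else the link resolver."""
    links = record.get("fulltext_links") or []
    if links:
        link = links[0]
        fmt = link.get("format", "").upper()
        # Label the link as PDF or HTML only when the format is known.
        label = f"Full text ({fmt})" if fmt in ("PDF", "HTML") else "Full text"
        return label, link["url"]
    return "Find full text", build_openurl(resolver_base, record["citation"])
```

For example, a record with no direct links and a citation of `{"atitle": "A B", "issn": "1234-5678"}` falls through to the resolver branch. The interface point is that whichever link wins is the one shown most prominently.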

  3. Sorry, Jonathan.

    I honestly overlooked the standard disclaimer that I work for a company providing some of the technologies…

    I work for Deep Web Technologies (www.deepwebtech.com), and yes, produce some of the websites I mentioned above.

    I’ll attempt to answer your question directed to me, by discussing your last paragraph.

    I do think there are ways to help mitigate the effects of traditional issues associated with federated search. For example:

    – Long search times? Provide incremental results.
    – Connectors breaking? Better workflow to route problems to the appropriate personnel. Utilize a vendor to fix connectors within an SLA.

    I could go on. Summon helps by bringing unified search into the mix, but it won’t solve these problems either for those sources that can’t be indexed.

    Larry.

  4. Yeah, you’re right, those are the approaches to take. I think that the vendor workflow issue tends to be very resource intensive for diminishing gains, although it’s really the only option for improvement there.

    I am skeptical that a decent interface can be provided with incremental results. How do you merge the later results into the search set without really confusing the user? And how do you avoid merging them in such a way that most users simply ignore them? In that case you might as well not even include slow-performing sources, but then we’re back at square one. There might be tweaks you can perform here, but I think it’s going to be hard to get a really good interface with this technique.

  5. Good point. In the sample websites I provided in my original comment, they do provide incremental results, but ask the user before incorporating new results into the existing results.

    We’ve found that different clients prefer different approaches, with repeat visitors appearing to appreciate incremental results more than first-time visitors (first-time visitors do tend to get a bit confused).

    We’re about to commission a UI expert to give us some additional feedback and support around this, with the possible addition of better visuals to help first-time users understand what’s going on.

    Larry.

  6. Hi all,

    I need to implement incremental results in my application. Right now I fork child processes to get results from the sources, and the parent process waits for all of the children to finish, hence the need for incremental results. Can anyone please explain the basic way to do this technically? How should the application’s execution be structured? I am using Perl for my app.

    Poonam
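As a rough illustration of the incremental results discussed in this thread, the sketch below hands back each source's results as soon as that source finishes, instead of waiting for the slowest one. It is written in Python rather than Perl, and everything in it is an assumption for illustration: `search_source` just sleeps to simulate a remote source, and the source names and delays are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def search_source(name, delay, query):
    """Stand-in for a remote search: sleep to simulate latency, then return hits."""
    time.sleep(delay)
    return name, [f"{name}: result for {query!r}"]


def incremental_search(query, sources):
    """Yield (source_name, hits) in the order the sources finish.

    `sources` maps a source name to its simulated delay in seconds.
    """
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [
            pool.submit(search_source, name, delay, query)
            for name, delay in sources.items()
        ]
        # as_completed hands back each future as soon as it is done,
        # so fast sources can be displayed before slow ones return.
        for future in as_completed(futures):
            yield future.result()
```

Looping over `incremental_search("federated search", {"slow": 0.4, "fast": 0.0, "medium": 0.2})` produces the fast source's hits first and the slow source's last. The same shape should carry over to forked Perl children: rather than waiting on every child, the parent reads from whichever child's pipe becomes readable first (for instance with the core `IO::Select` module) and passes those results on to the interface right away.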
