Update Aug 2010: I’ve done some time-distribution visualization with Flot, Solr, and Blacklight, not quite what’s contemplated in this post, but it’s kinda neat. https://bibwild.wordpress.com/2010/07/29/cool-range-limitprofile-function-in-blacklight/
So I’ve been thinking for a while about visualizing time distribution in an OPAC view. Things in our catalog generally have a year they were published, or a range of years for a serial; and sometimes are about a particular time period too.
(Also, Google seems to mix the date a web page was published with the dates mentioned in the page as data points, which seems kind of an odd choice. But that’s a different topic; here I’m mostly thinking about interfaces for visualizing a timeline of dates, not how you choose which dates to put on the timeline.)
What I wound up finding was flot, which is not for timeline visualization specifically; it’s a general-purpose data visualization jQuery plugin. And man, is it super neat! Incredibly powerful and flexible, but with a very simple, concise, and easy-to-use API, and incredibly slick-looking visualizations too. (I think a good principle of any kind of API design (or really any kind of system design at all) is that simple things should be simple to do; more complicated things can be more complicated to do, although they should still be as simple as you can make them. Flot does well here.)
Imagine this type of visualization (seriously, click on that link, it’s pretty sweet) of catalog timeline data. I like the two linked charts (overview, and zoom-in; similar to the Simile version and what Google kind of sort of klunkily does), and you can make selections in either one (click and drag to make a selection; also drag-panning). And view source to see how amazingly few and simple lines of JS were required to draw that, wow!
Just add some labelled vertical lines (which flot is quite capable of). Now, when you make a selection, you could get an immediately updated list of bib results in another part of the screen (bottom or side). And/or, when you mouse over (or click on) a particular year (or range, depending on zoom level), you could get a pop-up window listing the bibs in the time you clicked on.
Totally do-able with flot. Wow, flot is neat!
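To make the selection idea a bit more concrete, here is a minimal sketch of the data side of it, not anything I’ve actually built. The record shape (`title`, `year`) and the function names are made up for illustration. The only flot-specific assumptions are that a series is an array of `[x, y]` pairs, and that the selection plugin hands a `plotselected` handler a `ranges` object with `xaxis.from` and `xaxis.to`.

```javascript
// Hypothetical catalog records: one publication year each.
const sampleRecords = [
  { title: "Book A", year: 1990 },
  { title: "Book B", year: 1990 },
  { title: "Book C", year: 1995 },
  { title: "Book D", year: 2003 },
];

// Count records per year, returned as flot-style [[year, count], ...] sorted by year.
function yearCounts(records) {
  const counts = new Map();
  for (const r of records) {
    counts.set(r.year, (counts.get(r.year) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => a[0] - b[0]);
}

// Given an object shaped like the `ranges` argument of flot's "plotselected"
// event, return the records whose year falls inside the selection (inclusive).
function recordsInSelection(records, ranges) {
  const { from, to } = ranges.xaxis;
  return records.filter((r) => r.year >= from && r.year <= to);
}

// In a browser with jQuery, flot, and flot's selection plugin loaded, the
// wiring might look roughly like this (renderList is a hypothetical function
// that redraws the bib list):
//
//   $.plot($("#timeline"),
//          [{ data: yearCounts(sampleRecords), bars: { show: true } }],
//          { selection: { mode: "x" } });
//   $("#timeline").bind("plotselected", function (event, ranges) {
//     renderList(recordsInSelection(sampleRecords, ranges));
//   });
```

With a real catalog you would presumably ask the server for the counts and the filtered list rather than hold every record in the browser, but the shape of the data would be the same.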
It’s not entirely clear to me how you’d deal with items that have a range of dates instead of one particular date in that visualization though. (Like a serial, or a book about the 18th century). An ‘item’ with a range instead of a fixed date is one thing that the Simile widget is set up for, but neither the Google version nor any of the flot examples show. But if you can think of how to do it visually, I bet flot is probably flexible enough to let you do it.
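One possible way to fold ranged items into the same kind of chart, purely as a sketch: count an item once for every year it spans, so a serial or a book about the 18th century adds to the whole stretch it covers. The `start`/`end` field names are hypothetical, and this is just one choice among several; it will make long-running serials weigh heavily, so you might prefer to plot them as a separate series.

```javascript
// Hypothetical items: `start` and `end` are inclusive years.
// A single-date item simply has start === end.
function spanCounts(items) {
  const counts = new Map();
  for (const { start, end } of items) {
    // Add one to every year the item covers.
    for (let year = start; year <= end; year++) {
      counts.set(year, (counts.get(year) || 0) + 1);
    }
  }
  // flot-style series: [[year, count], ...] sorted by year.
  return [...counts.entries()].sort((a, b) => a[0] - b[0]);
}

// Example: a three-year serial overlapping a single-year monograph.
const exampleSpans = spanCounts([
  { start: 1700, end: 1702 },
  { start: 1701, end: 1701 },
]);
// exampleSpans is [[1700, 1], [1701, 2], [1702, 1]]
```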
Maybe some day I’ll get to play around with that. Not any day soon, I don’t think, sadly. Sometimes I feel like I am continually building the basic boring parts of my systems to a bare level of competence, and just when I think I’ve got that done and can finally start doing some really cool stuff on the platform I’ve built, nope, there’s a different system that I’ve got to work on getting to the level of a basic, robust, competent platform. Oh well, some day.
7 thoughts on “idle thoughts: timeline visualization in a catalog”
As soon as I read this I got excited and rewrote the engine that harvests subjects in my OPAC search results to generate a “tag” cloud, so that it also harvests the publication dates. Now my problem is, I have the data, but only for the search results visible on the screen. I’m limited to client-side manipulations.
Best case scenario, I’ve got 50 dates in an array to work with. Is that a useful data set to generate a timeline? What if there were 800 search results, then what good is a timeline representing only the 50 results on my screen? D’oh.
So wait, your tag cloud is just a cloud of tags on that particular screen? So if there were 800 results, you only get tags from the first 10-50 in the cloud? Do you have an idea of how/whether people find this useful? Ah, III, huh? You certainly could make a little mini “spark line” bar chart or something on the page for just that page of results, but I’m not sure how useful it would be either.
I think this is a pretty nice example [Keyword = Boeing] of what you get when all the results fit on a single results screen. In this example of multi-page results [Keyword = Ford] you can really see that the subject content of each page of results varies greatly from the last. This is where having the whole data set could help.
That is essentially what III does in Encore, though I suppose they get to harvest the subjects from the whole results set. I can think of a few ways to call up and scrape more pages of results, but the performance hit would be terrible.
The spark line presentation of the dates is an interesting idea, I think I’ll have to play with a generator.
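For the page-of-results case, a rough sketch of what such a generator might start from: take the array of publication years scraped from the visible results and bin them by decade, filling in empty decades so the bars stay evenly spaced. The function name and the choice of decade-sized bins are assumptions for illustration only.

```javascript
// Bin an array of years into decades, returning [[decadeStart, count], ...]
// with zero-count decades included between the earliest and latest.
function decadeBins(years) {
  if (years.length === 0) return [];
  const decades = years.map((y) => Math.floor(y / 10) * 10);
  const min = Math.min(...decades);
  const max = Math.max(...decades);
  const bins = [];
  for (let d = min; d <= max; d += 10) {
    bins.push([d, 0]);
  }
  for (const d of decades) {
    bins[(d - min) / 10][1] += 1;
  }
  return bins;
}

// Example with four years from one page of results.
const exampleBins = decadeBins([1985, 1991, 1999, 2010]);
// exampleBins is [[1980, 1], [1990, 2], [2000, 0], [2010, 1]]
```

The result is already in the `[x, y]` pair form that a bar chart library such as flot takes as a data series, though it still only describes the results on the current page.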
I am actively attempting (thanks to summer break!) to code up a timeline with flot right now. It sounds very, very similar (with tags, etc.) to all these ideas floating about. However, I’m failing (at this moment) to zoom on timelines. Whenever I get a working demo, I’m sure I’ll post, and hopefully trackback here.
Have you had any luck thus far?
FYI, my data-set is all life-based activities: from weather temperatures, RSS-feeds, files edited, email counts, etc. I hope to eventually use a spam-filter/AI to gain the content-topics on all of ’em. Ambitious, I know.
I haven’t actually written any code yet; but did you try to model your code on the example I linked to? It does provide a zoom example, and is only a few lines of code.
(Of course I chimed in without fully comprehending your problem. But Protovis does an admirable job of doing a new sort of API for JS dataviz.)