DOAJ API in bento_search 1.5

bento_search is a gem I wrote that lets you search third-party search engine APIs with a standardized, simple, natural Ruby API. It’s focused on ‘scholarly’ sources and use cases.

In the just-released version 1.5, a search engine adapter is included for the Directory of Open Access Journals (DOAJ) article search API.

While there certainly might be circumstances where you want to provide end-users with interactive DOAJ searches embedded in your application, the use cases I mainly have in mind are different: back-end known-item lookup in DOAJ.

It’s not a coincidence that bento_search introduced multi-field querying in this same 1.5 release.

The SFX link resolver is particularly bad at getting users to direct article-level links for open access articles. (Are products from competitors like SerSol or OCLC any better here?) At best, you are usually directed to a journal-level URL for the journal title the article appears in.

But what if the link resolver knew it was probably an open access journal based on ISSN (or at the Umlaut level, based on SFX returning a DOAJ_DIRECTORY_OPEN_ACCESS_JOURNALS_FREE target as valid)? You could take the citation details, look them up in DOAJ to see if you get a match, and if so take the URL returned by DOAJ and return it to the user, knowing it’s going to be open access and not paywalled.

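searcher = BentoSearch::DoajArticlesEngine.new
results = searcher.search(:query => {
    :issn       => "0102-3772",
    :volume     => "17",
    :issue      => "3",
    :start_page => "207"
})
if results.count > 0
   url = results.first.link
   # => the article URL that DOAJ returned
   # hey, maybe we got a doi too.
   doi = results.first.doi
   # => "10.1590/S0102-37722001000300002"
end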

Or if an application isn’t sure whether an article citation is available open access or not, it could check DOAJ to see if the article is listed there.

Perhaps such a feature will be added to Umlaut at some point.

As more and more is published open access, DOAJ might also be useful as a general large aggregator for metadata enhancement or DOI reverse lookup, for citations in its database.

Another known-item-lookup use of DOAJ might be to fetch an abstract for an article in its database.



For anyone interested in using the DOAJ Article Search API (some of whom might arrive here from Google), I found the DOAJ API to be pretty easy to work with and straightforward, but I did encounter a couple tricky parts that are worth sharing.

URI Escaping in a Path component

The DOAJ Search API has the query in the path component of the URL, not a query param: /api/v1/search/articles/{search_query}

In the path component of a URI, spaces are not escaped as “+”; “+” just means “+”, and will indeed be interpreted that way by the DOAJ servers. (Thanks DOAJ API designers for echoing back the query in the response, to make my bug there a bit more discoverable!) Spaces are escaped as “%20”. (Really, escaping spaces as “+” even in a query param is an odd legacy practice of unclear standards compliance, but most systems accept it in the query params after the ? in a URL.)

At first I just reached for my trusty ruby stdlib method `CGI.escape`, but that escapes spaces as `+`, resulting in faulty input to the API. Then I figured maybe I should be using ruby `URI.escape`; that does turn spaces into “%20”, but leaves some things like “/” alone entirely. True, “/” is legal in a URI, but as a path component separator! If I actually wanted it inside the last path component as part of the query, it should be escaped as “%2F”. (I don’t know if that would ever be a useful thing to include in a query to this API, but I strive for completeness.)

So I settled for ruby `CGI.escape(input).gsub("+", "%20")`: ugly, but okay.
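To make the escaping concrete, here is a minimal sketch of that workaround as a helper. The method names are my own for illustration, not anything from bento_search or DOAJ, and it only builds the path, it doesn’t make a request. (`URI.escape` was removed in Ruby 3.0, so on a current Ruby it isn’t an option anyway.)

```ruby
require "cgi"

# Escape a string for use as a single URI path segment.
# CGI.escape encodes a space as "+" and a literal "+" as "%2B", so any "+"
# left in its output stands for a space and can safely become "%20".
def escape_path_segment(input)
  CGI.escape(input).gsub("+", "%20")
end

# Hypothetical helper: builds only the path portion of a search request.
def doaj_article_search_path(query)
  "/api/v1/search/articles/#{escape_path_segment(query)}"
end

puts doaj_article_search_path("apple orange")
puts doaj_article_search_path("apple/orange")
puts doaj_article_search_path("apple+orange")
```

The three calls show a space, a slash, and a literal plus each ending up percent-encoded inside the last path segment.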

Really, for designing APIs like this, I’d suggest always leaving a query like this in a URI query param where it belongs. It might initially seem nice to have URLs for search results with the search term right in the path, but when you start having multi-word input, or worse a complex expression (see next section), it gets less nice quick.

Escaping is confusing enough already; stick to the convention. There’s a reason the query component of the URI (after the question mark) is called the query component of the URI!

ElasticSearch as used by DOAJ API defaults to OR operator

At first I was confused by the results I was getting from the API, which seemed very low precision, including results I couldn’t see any reason for.

The DOAJ Search API docs helpfully tell us that it’s backed by ElasticSearch, and the query string can be most any ElasticSearch query string. 

I realized that for multi-word queries, it was sending them to ElasticSearch with the default `default_operator` of “OR”, meaning all terms were ‘optional’, and apparently with a very low (1?) `minimum_should_match`.

That meant results included documents with just any one of the search terms, which didn’t generally produce intuitive or useful results for this corpus and use case. Note that DOAJ’s own end-user-facing search uses an “AND” `default_operator`, producing much more precise results.

Well, okay, I can send it any ElasticSearch query, so I’ve just got to prepend a “+” operator to all terms, to make them mandatory. Which gets a bit trickier when you want to support phrases too, as I do; you need to do a bit of tokenization of your own. But doable.

Instead of sending the query as the user may have entered it:  apple orange "strawberry banana"

Send query: +apple +orange +"strawberry banana"

Or for a fielded search:  bibjson.title:(+apple +orange +"strawberry banana")

Or for a multi-fielded search where everything is still supposed to be mandatory/AND-ed together, the somewhat imposing:  +bibjson.title:(+apple +orange +"strawberry banana") +rochkind)

Convoluted, but it works out.
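Here’s a rough sketch of the kind of tokenizing involved. It is not the actual bento_search code; the method names are made up for illustration, and `bibjson.author.name` is only an assumed second field name (`bibjson.title` is the one shown above). It also doesn’t try to escape characters that are special in the query string syntax, which a real implementation would need to think about.

```ruby
# Split on whitespace, but keep a double-quoted phrase together as one token.
def tokenize(query)
  query.scan(/"[^"]*"|\S+/)
end

# Prefix every token with "+" so each term or phrase becomes mandatory.
def require_all_terms(query)
  tokenize(query).map { |token| "+#{token}" }.join(" ")
end

# Wrap the mandatory terms in a clause scoped to one field.
def fielded_clause(field, query)
  "#{field}:(#{require_all_terms(query)})"
end

# Make every fielded clause mandatory too, so the fields are AND-ed together.
def multi_field_query(fields)
  fields.map { |field, query| "+#{fielded_clause(field, query)}" }.join(" ")
end

puts require_all_terms('apple orange "strawberry banana"')
puts fielded_clause("bibjson.title", 'apple orange "strawberry banana"')
puts multi_field_query(
  "bibjson.title"       => 'apple orange "strawberry banana"',
  "bibjson.author.name" => "rochkind" # assumed field name, for illustration
)
```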

I really like that they allow the API client to send a complete ElasticSearch query; it let me do what I wanted even if it wasn’t what they had anticipated. I’d encourage this pattern for other query APIs. But if you are allowing the client to send in an ElasticSearch (or Solr) query, it would be much more convenient if you also let the client choose the `default_operator` (Solr `q.op`) and `minimum_should_match` (Solr `mm`).

So, yeah, bento_search

The beauty of bento_search is that one developer figures out these confusing idiosyncrasies once (and most of the bento_search targets have such things) and encodes them in the bento_search logic. You, the bento_search client, can be blissfully ignorant of them: you just call methods on a BentoSearch::DoajArticlesEngine the same as on any other bento_search engine (eg, searching for ‘apple orange “strawberry banana”’), and it takes care of the under-the-hood API-specific idiosyncrasies, workarounds, or weirdness.

Notes on ElasticSearch

I haven’t looked much at ElasticSearch before, although I’m pretty familiar with its cousin Solr.

I started looking at the ElasticSearch docs since the DOAJ API told me I could send it any valid ElasticSearch query. I found it familiar from my Solr work; they are both based on Lucene, after all.

I started checking out documentation beyond what I needed (or could make use of) for the DOAJ use too, out of curiosity. I was quite impressed with ElasticSearch’s feature set, and its straightforward and consistent API.

One thing to note is ElasticSearch’s really neat query DSL that lets you specify queries as a JSON representation of a query abstract syntax tree — rather than just try to specify what you mean in a textual string query.  For machine-generated queries, this is a great feature, and can make it easier to specify complicated queries than in a textual string query — or make certain things possible that are not even possible at all in the textual string query language.
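As a small illustration of the difference, here is a sketch that builds a query DSL body as a plain Ruby hash and serializes it to JSON. It uses ElasticSearch’s `query_string` query with an explicit `default_operator`, which is exactly the knob I wished I could set above. This assumes you are talking to an ElasticSearch search endpoint directly; the DOAJ API as described here only takes the textual query string, and the method name is my own.

```ruby
require "json"

# Build a query DSL body asking for all terms to match, optionally scoped
# to one field. The structure is a tree of hashes, not a string to parse.
def all_terms_query(text, field: nil)
  query_string = { "query" => text, "default_operator" => "AND" }
  query_string["default_field"] = field if field
  { "query" => { "query_string" => query_string } }
end

body = all_terms_query('apple orange "strawberry banana"', field: "bibjson.title")
puts JSON.pretty_generate(body)
```

Since the query is just data, machine-generated queries can be assembled and combined without any string concatenation or escaping tricks.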

I recall Erik Hatcher telling me several years ago — possibly before ElasticSearch even existed — that a similar feature was being contemplated for Solr (but taking XML input instead of JSON, naturally).   I’m sure the hypothetical Solr feature would be more powerful than the one in ElasticSearch, but years later it still hasn’t landed in Solr so far as I know, but there it is in ElasticSearch….

I’m going to try to keep my eye on ElasticSearch.

Posted in General | 1 Comment

bento_search 1.5, with multi-field queries

bento_search is a gem I wrote that lets you search third-party search engine APIs with a standardized, simple, natural Ruby API. It’s focused on ‘scholarly’ sources and use cases.

Version 1.5, just released, includes support for multi-field searching:

searcher = BentoSearch::ScopusEngine.new(:api_key => ENV['SCOPUS_API_KEY'])
results = searcher.search(:query => {
    :title  => '"Mystical Anarchism"',
    :author => "Critchley",
    :issn   => "14409917"
})

Multi-field searches are always AND’d together, title=X AND author=Y; because that was the only use case I had and seems like mostly what you’d want. (On our existing Blacklight-powered Catalog, we eliminated “All” or “Any” choices for multi-field searches, because our research showed nobody ever wanted “Any”).

As with everything in bento_search, you can use the same API across search engines, whether you are searching Scopus or Google Books or Summon or EBSCOHost, you use the same ruby code to query and get back results of the same classes.

Except, well, multi-field search is not yet supported for Summon or Primo, because I do not have access to those proprietary products or documentation to make sure I have the implementation right and test it. I’m pretty sure the feature could be added pretty easily to both, by someone who has access (or wants to share it with me as an unpaid ‘contractor’ to add it for you).

What for multi-field querying?

You certainly could expose this feature to end-users in an application using a bento_search powered interactive search. And I have gotten some requests for supporting multi-field search in our bento_search powered ‘articles’ search in our discovery layer; it might be implemented at some point based on this feature.

(I confess I’m still confused why users want to enter text in separate ‘author’ and ‘title’ fields, instead of just entering the author’s name and title in one ‘all fields’ search box, Google-style. As far as I can tell, all bento_search engines perform pretty well with author and title words entered in the general search box. Are users finding differently? Do they just assume it won’t, and want the security, along with the more work, of entering in multiple fields? I dunno).

But I’m actually more interested in this feature for uses other than directly exposed interactive search.

It opens up a bunch of possibilities for under-the-hood known-item identification in various external databases.

Let’s say you have an institutional repository with pre-prints of articles, but it’s only got author and title metadata, and maybe the name of the publication it was eventually published in, but not volume/issue/start-page, which you really want for better citation display and export, analytics, or generation of a more useful OpenURL.

So you take the metadata you do have, and search a large aggregating database to see if you can find a good match, and enhance the metadata with what that external database knows about the article.
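The ‘enhance’ step might look something like this sketch, which deliberately uses plain hashes rather than real bento_search result objects. The method name and the field keys are assumptions for illustration, and the values are just the ones from the examples in this post. The idea is to fill in only what you are missing, and never overwrite metadata you already have.

```ruby
# Fill gaps in our own citation from a matched external record.
# Values we already have win; nil or empty values are filled from the match.
def enhance_citation(local, match)
  local.merge(match) do |_key, ours, theirs|
    (ours.nil? || ours.to_s.empty?) ? theirs : ours
  end
end

local = { :title => "Mystical Anarchism", :author => "Critchley", :volume => nil }
match = { :title => "MYSTICAL ANARCHISM", :volume => "10", :issue => "2", :start_page => "272" }

enhanced = enhance_citation(local, match)
puts enhanced.inspect
```

Deciding whether a search hit is a good enough match to trust is the harder part, and isn’t covered here.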

Similarly, citations sometimes come into my OpenURL resolver (powered by Umlaut) that lack sufficient metadata for good coverage analysis and outgoing link generation, for which we generally need year/volume/issue/start-page too. Same deal.

Or in the other direction, maybe you have an ISSN/volume/issue/start-page, but don’t have an author and title. Which happens occasionally at the OpenURL link resolver, maybe other places. Again, search a large aggregating database to enhance the metadata, no problem:

results = searcher.search(:query => {
    :issn       => "14409917",
    :volume     => "10",
    :issue      => "2",
    :start_page => "272"
})

Or maybe you have a bunch of metadata, but not a DOI — you could use a large citation aggregating database that has DOI information as a reverse-DOI lookup. (Which makes me wonder if CrossRef or another part of the DOI infrastructure might have an API I should write a BentoSearch engine for…)

Or you want to look up an abstract. Or you want to see if a particular citation exists in a particular database for value-added services that database might offer (look inside from Google Books; citation chaining from Scopus, etc).

With multi-field search in bento_search 1.5, you can do a known-item ‘reverse’ lookup in any database supported by bento_search, for these sorts of enhancements and more.

In my next post, I’ll discuss this in terms of DOAJ, a new search engine added to bento_search in 1.5.


Oyster commercial ebook lending library shutting down

I’ve written a couple times about Oyster, the commercial ebook lending library, and what it might mean for the future of the book marketplace. (Dec 2013, April 2015).

So it seems right to add the coda — Oyster is going out of business.

One of the challenges that Oyster faced was having to constantly placate publishers concerns.  The vast majority of them are very apprehensive about going the same route music or movies went.

In a recent interview with the Bookseller, Arnaud Nourry, the CEO of Hachette said “We now have an ecosystem that works. This is why I have resisted the subscription system, which is a flawed idea even though it proliferates in the music business. Offering subscriptions at a monthly fee that is lower than the price of one book is absurd. For the consumer, it makes no sense. People who read two or three books a month represent an infinitesimal minority.”

Penguin Random House’s CEO Tom Weldon echoed Arnaud’s sentiments at the Futurebook conference a little awhile ago in the UK. “We have two problems with subscription. We are not convinced it is what readers want. ‘Eat everything you can’ isn’t a reader’s mindset. In music or film you might want 10,000 songs or films, but I don’t think you want 10,000 books.”

–– Oyster is Shutting Down their e-Book Subscription Service by Michael Kozlowski 

The closure of Oyster comes two months after Entitle, another e-book subscription service, closed. With Entitle and now Oyster gone there is one remaining standalone e-book service, Scribd, as well Amazon’s Amazon Unlimited service.

–– Oyster Is Shutting Down Operations, Publisher’s Weekly

What could have done Oyster in? Oh, I don’t know, perhaps another company with a subscription e-book service and significantly more resources and consumers. Like, say, Amazon? It was pretty clear back when Amazon debuted “Kindle Unlimited” in July 2014 that the service could spell trouble for Oyster. The price was comparable ($9.99 a month) as was the collection of titles (600,000 on Kindle Unlimited as compared to about 500,000 at the time on Oyster). Not to mention that Amazon Prime customers already had complimentary access to one book a month from the company’s Kindle Owner’s Lending Library (selection that summer: more than 500,000). In theory, Oyster’s online e-book store was partly created to strengthen its bid against Amazon, but even here the startup was fighting a losing battle, with many titles priced significantly higher there than on Jeff Bezos’ platform.

Where Oyster failed to take Amazon on, however, it’s conceivable that Google plus a solid portion of Oyster’s staff could succeed. The Oyster team has the experience, while Google has the user base and largely bottomless pockets. By itself, Oyster wasn’t able to bring “every book in the world” into its system. But with Google, who knows? The Google Books project, a sort of complement to the Google Play Store, is already well on its way to becoming a digital Alexandria. Reincarnated under the auspices of that effort, Van Lancker’s dream may happen yet.


Optional gem dependencies

I’ve sometimes wanted to release a gem with an optional dependency. In one case I can recall, it was an optional dependency on Celluloid (although I’d update that to use concurrent-ruby if I did it over again now) in bento_search.

I didn’t want to include (eg) Celluloid in the gemspec, because not all or even most uses of bento_search use Celluloid. If I included it in the gemspec, bundler/rubygems would insist on installing Celluloid for all users of my gem, and in some setups the app would also actually require Celluloid on boot too. Requiring celluloid on boot will also do some somewhat expensive setup code, run some background threads, and possibly give you strange warnings on app exit (all of those things at least in some versions of Celluloid, like the one I was developing against at the time; not sure if it’s still true). I didn’t want any of those things to happen for most people who didn’t need Celluloid with bento_search.

But rubygems/bundler has no way to specify an optional gem dependency. So I resorted to not including the desired optional dependency in my gemspec, but just providing documentation saying “If you want to use feature X, which uses Celluloid, you must add Celluloid to your Gemfile yourself.”

What I didn’t like was there was no way, other than documentation, to include a version specification for what versions of Celluloid my own gem demanded, as an optional dependency. You don’t need to use Celluloid at all, but if you do, then it must be a certain version we know we work with. Not too old (lacking features or having bugs), but also not too new (may have backwards breaking changes; assuming the optional dependency uses semver so that’s predictable on version number).

I thought there was no way to include a version specification for this kind of optional dependency. To be sure, an optional gem dependency is not a good idea. Don’t do it unless you really have to, it complicates things. But I think sometimes it really does make sense to do so, and if you have to, it turns out there is one way to deal with specifying version requirements too.

Because it turns out Rails agrees with me that sometimes an optional dependency really is the lesser evil. The ActiveRecord database adapters are included with Rails, but they often depend on a lower-level database-specific gem, which is not included as an actual gemspec dependency.

They provide a best-of-dealing-with-a-bad-situation pattern to specify version constraints too: use the runtime (not Bundler) `gem` method that rubygems provides, as at:

This will get executed only if you are requiring the mysql2_adapter. If you are, it’ll try to load the `mysql2` gem with that version spec (in that version of mysql2_adapter, `~> 0.3.13`). If the “optional dependency” is not loaded at all (because you didn’t include it in your Gemfile), or a version is loaded that doesn’t match those version requirements, it’ll raise a Gem::LoadError, which Rails catches and re-raises with a somewhat better message:

Of course, this leads to a problem many of us have run into over the past two days since mysql2 0.4.0 was released. The generated (or recommended) Gemfile for a Rails app using mysql2 includes an unconstrained `gem "mysql2"`. So Bundler is willing to install and use the newly released mysql2 0.4.0. But then the mysql2_adapter is not willing to use 0.4.0, and ends up raising the somewhat confusing error message:

Specified 'mysql2' for database adapter, but the gem is not loaded. Add `gem 'mysql2'` to your Gemfile (and ensure its version is at the minimum required by ActiveRecord).

In this case, mysql2 0.4.0 would in fact work fine, but mysql2_adapter isn’t willing to use it. (As an aside, why the heck isn’t the mysql2 gem at 1.0 yet and using semver?) As another aside, if you run into this, until Rails fixes things up, you need to modify your app Gemfile to say `gem 'mysql2', "< 0.4.0"`, since Rails 4.2.4 won’t use 0.4.0.

The error message is confusing, because the problem was not a minimum specified by ActiveRecord, but a maximum.  And why not have the error message more clearly tell you exactly what you need?

Leaving aside the complexities of what Rails is trying to do and the right fix on Rails’ end, if I need an optional dependency in the future, I think I’d follow Rails’ lead, but improve upon the error message:

begin
   gem 'some_gem', "~> 1.4.5"
rescue Gem::LoadError => e
   raise Gem::LoadError, "You are using functionality requiring 
     the optional gem dependency `#{e.name}`, but the gem is not
     loaded, or is not using an acceptable version. Add 
     `gem '#{e.name}'` to your Gemfile. Version #{MyGem::VERSION}
     of my_gem_name requires #{e.name} that matches #{e.requirement}"
end

Note that the Gem::LoadError includes a requirement attribute that tells you exactly what the version requirements were that failed. Why not include this in the message too, to make it somewhat less confusing?

Except I realize we’re still creating a new Gem::LoadError, without those super useful `name` and `requirement` fields filled out. Our newly raised exception probably ought to copy those over properly too. Left as an exercise to the reader.
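For what it’s worth, one way that exercise might go is sketched below. The helper name is hypothetical, and the gem name in the usage is just one I assume is not installed. It relies on `Gem::LoadError` exposing writable `name` and `requirement` attributes, which it does in the rubygems versions I’m aware of.

```ruby
require "rubygems"

# Hypothetical helper: activate an optional dependency with a version
# constraint, re-raising a clearer Gem::LoadError that still carries the
# original error's `name` and `requirement`.
def activate_optional_gem(gem_name, *version_requirements)
  gem gem_name, *version_requirements
rescue Gem::LoadError => e
  wanted = version_requirements.empty? ? "any version" : version_requirements.join(", ")
  message = "You are using functionality requiring the optional gem " \
            "dependency `#{gem_name}`, but the gem is not loaded, or is not " \
            "using an acceptable version. Add `gem '#{gem_name}'` to your " \
            "Gemfile, matching #{wanted}."
  error = Gem::LoadError.new(message)
  error.name = e.name
  error.requirement = e.requirement
  raise error
end

begin
  activate_optional_gem("some_gem_that_is_not_installed", "~> 1.4.5")
rescue Gem::LoadError => e
  puts e.message
  puts "name: #{e.name}"
end
```

Callers that rescue the error can then still inspect `name` and `requirement`, not just read the message.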

I may try to submit a PR to Rails to include a better error message here.

Optional dependencies are still not a good idea. They lead to confusingness like Rails ran into here. But sometimes you really do want to do it anyway, it’s not as bad as the alternatives. Doing what Rails does seems like the least worst pattern available for this kind of optional dependency: Use the runtime `gem` method to specify version constraints for the optional dependency; catch the `Gem::LoadError`;  and provide a better error message for it (either re-raising or writing to log or other place developer will see an error).


Memories of my discovery of the internet

As I approach 40 years old, I find myself getting nostalgic and otherwise engaged in memories of my youth.

I began high school in 1989. I was already a computer nerd, beginning from when my parents sent me to a Logo class for kids sometime in middle school; I think we had an Apple IIGS at home then, with a 14.4 kbps modem. (Thanks Mom and Dad!).  Somewhere around the beginning of high school, maybe the year before, I discovered some local dial-up multi-user BBSs.

Probably from information on a BBS, somewhere probably around 1994, me and a friend discovered Michnet, a network of dial-up access points throughout the state of Michigan, funded, I believe, by the state department of education. Dialing up Michnet, without any authentication, gave you access to a gopher menu. It didn’t give you unfettered access to the internet, but just to what was on the menu — which included several options that would require Michigan higher ed logins to proceed, which I didn’t have. But also links to other gophers which would take you to yet other places without authentication. Including a public access unix system (which did not have outgoing network connectivity, but was a place you could learn unix and unix programming on your own), and ISCABBS. Over the next few years I spent quite a bit of time on ISCABBS, a bulletin board system with asynchronous message boards and a synchronous person-to-person chat system, which at that time routinely had several hundred simultaneous users online.

So I had discovered The Internet. I recall trying to explain it to my parents, and that it was going to be big; they didn’t entirely understand what I was explaining.

When visiting colleges to decide on one in my senior year, planning on majoring in CS, I recall asking at every college what the internet access was like there, if they had internet in dorm rooms, etc. Depending on who I was talking to, they may or may not have known what I was talking about. I do distinctly recall the chair of the CS department at the University of Chicago telling me “Internet in dorm rooms? Bah! The internet is nothing but a waste of time and a distraction of students from their studies, they’re talking about adding internet in dorm rooms but I don’t think they should! Stay away from it.” Ha. I did not enroll at the U of Chicago, although I don’t think that conversation was a major influence.

Entering college in 1993, in my freshman year in the CS computer lab, I recall looking over someone’s shoulder and seeing them looking at a museum web page in NCSA Mosaic. The workstations in the lab were unix X-windows systems of some kind; I forget what variety of unix. I had never heard of the web before. I was amazed, I interrupted them and asked “What is that?!?”. They said “it’s the World Wide Web, duh.”  I said “Wait, it’s got text AND graphics?!?”  I knew this was going to be big. (I can’t recall the name of the fellow student a year or two ahead who first showed me the WWW, but I can recall her face. I do recall Karl Fogel, who was a couple years ahead of me and also in CS, kindly showing me things about the internet on other occasions. Karl has some memories of the CS computer lab culture at our college at the time here, I caught the tail end of that).

Around 1995, the college IT department hired me as a student worker to create the first-ever experimental/prototype web site for the college. The IT director had also just realized that the web was going to be big, and while the rest of the university hadn’t caught on yet, he figured they should do some initial efforts in that direction. I don’t think CSS or JS existed yet then, or at any rate I didn’t use them for that website. I did learn SQL on that job.  I don’t recall much about the website I developed, but I do recall one of the main features was an interactive campus map (probably using image maps).  A year or two or three later, when they realized how important it was, the college Communications unit (ie, advertising for the college)  took over the website, and I think an easily accessible campus map disappeared not to return for many years.

So I’ve been developing for the web for 20 years!

Ironically (or not), some of my deepest nostalgia these days is for the pre-internet pre-cell-phone society; even most of my university career pre-dated cell phones. If you wanted to get in touch with someone, you called their dorm room, maybe left a message on their answering machine.  The internet, and then cell phones, eventually combining into smart phones, have changed our social existence truly immensely, and I often wonder these days if it’s been mostly for the better or not.


bento_search 1.4 released

bento_search is a ruby gem that provides a standardized ruby API and other support for querying external search engines with HTTP APIs, retrieving results, and displaying them in Rails. It’s focused on search engines that return scholarly articles or citations.

I just released version 1.4.

The main new feature is a round-trippable JSON serialization of any BentoSearch::Results or Items. This serialization captures internal state, suitable for a round-trip, such that if you’ve changed configuration related to an engine between dump and load, you get the new configuration after load. Its main use case is a consumer that is also ruby software using bento_search. It is not really suitable for use as an API for external clients, since it doesn’t capture full semantics, but just internal state sufficient to restore to a ruby object with full semantics. (bento_search does already provide a tool that supports an Atom serialization intended for external client API use).
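To illustrate the general idea (and only the idea: this is not bento_search’s implementation or API, and every name in it is made up), here is a toy sketch of an ‘internal state’ serialization. The dump stores the item’s own data plus a pointer to its engine, and configuration is looked up fresh after load rather than being frozen into the dump.

```ruby
require "json"

# Hypothetical stand-in for engine configuration that lives outside the
# serialized items and may change between dump and load.
ENGINE_CONFIG = {
  "doaj" => { "display_name" => "DOAJ" }
}

class ExampleItem
  attr_reader :title, :engine_id

  def initialize(title:, engine_id:)
    @title = title
    @engine_id = engine_id
  end

  # Configuration is looked up at call time, never stored in the dump.
  def engine_config
    ENGINE_CONFIG.fetch(engine_id)
  end

  # Dump only internal state: the item's data plus a pointer to its engine.
  def dump_state_json
    JSON.generate("title" => title, "engine_id" => engine_id)
  end

  # Rebuild an item from a dump produced by dump_state_json.
  def self.load_state_json(json)
    state = JSON.parse(json)
    new(title: state["title"], engine_id: state["engine_id"])
  end
end

dumped = ExampleItem.new(title: "Mystical Anarchism", engine_id: "doaj").dump_state_json

# Configuration changes after the dump...
ENGINE_CONFIG["doaj"]["display_name"] = "Directory of Open Access Journals"

# ...and the reloaded item sees the new configuration.
restored = ExampleItem.load_state_json(dumped)
puts restored.engine_config["display_name"]
```

An external client couldn’t do much with that dump, which is the point: it is only meaningful to code that has the same classes and configuration available.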

It’s interesting that once you start getting into serialization, you realize there’s no one true serialization, it depends on the use cases of the serialization. I needed a serialization that really was just of internal state, for a round trip back to ruby.

bento_search 1.4 also includes some improvements to make the specialty JournalTocsForJournal adapter a bit more robust. I am working on an implementation of JournalTocs fetching that needed the JSON round-trippable serialization too, for an Umlaut plug-in. Stay tuned.


Am I a “librarian”?

I got an MLIS degree, received a bit over 9 years ago, because I wanted to be a librarian, although I wasn’t sure what kind. I love libraries for their 100+ year tradition of investigation and application of information organization and retrieval (a fascinating domain, increasingly central to our social organization); I love libraries for being one of the few information organizations in our increasingly information-centric society that (often) aren’t trying to make a profit off our users so can align organizational interests with user interests and act with no motive but our users’ benefit; and I love libraries for their mountains of books too (I love books).

Originally I didn’t plan on continuing as a software engineer, I wanted to be ‘a librarian’.  But through becoming familiar with the library environment, including but not limited to job prospects, I eventually realized that IT systems are integral to nearly every task staff and users perform at or with a library, and I could have a job using less-than-great tech, knowing that I could make it better but having no opportunity to do so, or I could have a job making it better.  The rest is history.

I still consider myself a librarian. I think what I do — design, build, and maintain internal and purchased systems by which our patrons interact with the library and our services over the web —  is part of being a librarian in the 21st century.

I’m not sure if all my colleagues consider me a ‘real librarian’ (and my position does not require an MLIS degree).  I’m also never sure, when strangers or acquaintances ask me what I do for work, whether to say ‘librarian’, since they assume a librarian does something different than what I spend my time doing.

But David Lee King in a blog post What’s the Most Visited Part of your Library? (thanks Bill Dueber for the pointer), reminds us, I think from a public library perspective:

Do you adequately staff the busiest parts of your library? For example, if you have a busy reference desk, you probably make sure there are staff to meet demand….

Here’s what I mean. Take a peek at some annual stats from my library:

  • Door count: 797,478 people
  • Meeting room use: 137,882 people
  • Library program attendance: 76,043 attendees
  • Art Gallery visitors: 25,231 visitors
  • Reference questions: 271,315 questions asked

How about website visits? We had 1,113,146 total visits to the website in 2014. The only larger number is our circulation count (2,300,865 items)….

…So I’ll ask my question again: Do you adequately staff the busiest parts of your library?

I don’t have numbers in front of me from our academic library, but I’m confident that our ‘website’ — by which I mean to include our catalog, ILL system, link resolver, etc, all of the places users get library services over the web, the things me and my colleagues work on — is one of the most, if not the most, used ‘service points’ at our library.

I’m confident that the online services I work on reach more patrons, and are cumulatively used for more patron-hours, than our reference or circulation desks.

I’m confident the same as true at your library, and almost every library.

What would it mean for an organization to take account of this?  “adequate staffing”, as King says, absolutely. Where are staff positions allocated?  But also in general, how are non-staff resources allocated?  How is respect allocated? Who is considered a ‘real librarian’? (And I don’t really think it’s about MLIS degree either, even though I led with that). Are IT professionals (and their departments and managers) considered technicians to maintain ‘infrastructure’ as precisely specified by ‘real librarians’, or are they considered important professional partners collaborating in serving our users?  Who is consulted for important decisions? Is online service downtime taken as seriously (or more) than an unexpected closure to the physical building, and are resources allocated correspondingly? Is User Experience  (UX) research done in an actual serious way into how your online services are meeting user needs — are resources (including but not limited to staff positions) provided for such?

What would it look like for a library to take seriously that its online services are, by far, the most used service point in a library? Does your library look like that?

In the 21st century, libraries are Information Technology organizations. Do those running them realize that? Are they run as if they were? What would it look like for them to be?

It would be nice to start with just some respect.

Although I realize that in many of our libraries respect may not be correlated with MLIS-holders or who’s considered a “real librarian” either.  There may be some perception that ‘real librarians’ are outdated. It’s time to update our notion of what librarians are in the 21st century, and to start running our libraries recognizing how central our IT systems, and the development of such in professional ways, are to our ability to serve users as they deserve.


Virtual Shelf Browse is a hit?

With the semester starting back up here, we’re getting lots of positive feedback about the new Virtual Shelf Browse feature.

I don’t have usage statistics or anything at the moment, but it seems to be a hit, allowing people to do something like a physical browse of the shelves, from their device screen.

Positive feedback has come from underclassmen as well as professors. I am still assuming it is discipline-specific (some disciplines/departments simply don’t use monographs much), but appreciation and use do seem to cut across academic status/level.

Here’s an example of our Virtual Shelf Browse.

Here’s a blog post from last month where I discuss the feature in more detail.


blacklight_cql plugin

I’ve updated the blacklight_cql plugin for running without deprecation warnings on Blacklight 5.14.

I wrote this plugin way back in BL 2.x days, but I think many don’t know about it, and I don’t think anyone but me is using it, so I thought I’d take the opportunity having updated it, to advertise it.

blacklight_cql gives your BL app the ability to take CQL queries as input. CQL is a query language for writing boolean expressions. I don’t personally consider it suitable for end-users to enter manually, and don’t expose it that way in my BL app.

But I do use it as an API for other internal software to make complex boolean queries against my BL app, like “format = ‘Journal’ AND (ISSN = X OR ISSN = Y OR ISBN = Z)”. Paired with the BL Atom response, it’s a pretty powerful query API against a BL app.

Both direct Solr fields, and search_fields you’ve configured in Blacklight are available in CQL; they can even be mixed and matched in a single query.
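To make that concrete, here’s a rough sketch of how some internal client code might build a query like the one quoted above and turn it into a request URL. Treat the specifics as assumptions rather than documentation: the helper method names are made up for illustration, the catalog.example.edu host is a placeholder, and the `search_field=cql` and `q` parameter names and the `.atom` path are my guess at a typical setup, so check them against your own app’s routes and config. The index names (format, ISSN, ISBN) come from the example above and depend on what you’ve configured. Terms are wrapped in straight double quotes, which is what CQL expects for quoted terms.

```ruby
require "uri"

# Quote a CQL term, backslash-escaping backslashes and embedded double quotes.
def cql_term(value)
  '"' + value.to_s.gsub(/["\\]/) { |c| "\\#{c}" } + '"'
end

# Build: format = "..." AND (ISSN = "..." OR ... OR ISBN = "...")
# (hypothetical helper, not part of blacklight_cql)
def cql_for_identifiers(format:, issns: [], isbns: [])
  clauses = issns.map { |v| "ISSN = #{cql_term(v)}" } +
            isbns.map { |v| "ISBN = #{cql_term(v)}" }
  raise ArgumentError, "need at least one ISSN or ISBN" if clauses.empty?

  "format = #{cql_term(format)} AND (#{clauses.join(' OR ')})"
end

# Turn a CQL string into a request URL.
# ASSUMPTION: the app accepts search_field=cql and q=<cql query>.
def cql_search_url(base_url, cql)
  "#{base_url}?#{URI.encode_www_form(search_field: 'cql', q: cql)}"
end

cql = cql_for_identifiers(
  format: "Journal",
  issns:  ["0102-3772"],
  isbns:  ["9780000000002"]
)
puts cql
puts cql_search_url("https://catalog.example.edu/catalog.atom", cql)
```

Building the query from a list of identifiers, rather than hand-writing the string, keeps the quoting and URL-escaping in one place, which matters once other software is generating these queries for you.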

The blacklight_cql plug-in also provides an SRU/ZeeRex EXPLAIN handler, for a machine-readable description of what search fields are supported via CQL.  Here’s “EXPLAIN” on my server:

The plug-in does NOT provide a full SRU/SRW implementation — but as it does provide some of the hardest parts of an SRW implementation, it would probably not be too hard to write a bit more glue code to get a full implementation.  I considered doing that to make my BL app a target of various federated search products that speak SRW, but never wound up having a business case for it here.  (Also, it may or may not actually work out, as SRW tends to vary enough that even if it’s a legal-to-spec SRW implementation, that’s no guarantee it will work with a given client).

Even though the blacklight_cql plugin has been around for a while, it’s perhaps still somewhat immature software (or maybe it’s that it’s “legacy” software now?). It’s worked out quite well for me, but I’m not sure anyone else has used it, so it may have edge case bugs I’m not running into, or bugs that are triggered by use cases other than mine. It’s also, I’m afraid, not very well covered by automated tests. But I think what it does is pretty cool, and if you have a use for what it does, starting with blacklight_cql should be a lot easier than starting from scratch.

Feel free to let me know if you have questions or run into problems.


Blacklight Community Survey

I’ve created a Google Docs survey targeted at organizations who have Blacklight installations (or vendor-hosted BL installations on their behalf? Is that a thing?).

Including Blacklight-based stacks like Hydra.

The goal of the survey is to learn more about how Blacklight is being used in “the wild”, particularly (but not only) the software stacks people are using BL with.

If you host (or have hosted, or plan to host) a Blacklight-based application, it would be great if you filled out the survey!
