My Cataloging/Metadata Credo

I think our current metadata environment is seriously and fundamentally broken in several ways.

I do NOT think the solution lies in getting rid of everything we’ve got, or in relying on nothing but machine analysis of full text. I think the solution requires the continual engagement of metadata professionals. We will always need catalogers, that is, metadata professionals involved in the generation and maintenance of metadata, because that’s what catalogers are and have always been.

Continue reading “My Cataloging/Metadata Credo”


‘Access Points’ as Identifiers

An essay I originally posted to rda-l on 14 February 2007, and put here now mainly to have a persistent URL for easy access. I made a few minor edits for clarity while I was at it. (So this is perhaps a new Expression of my essay, if you’re keeping track.)

“Access points” as “Textual identifiers” ?

Continue reading “‘Access Points’ as Identifiers”

ruby trick question

Okay, back to nuts and bolts programming.

Can anyone explain exactly what’s going on when Ruby does something like “20.minutes.ago”? I mean, #minutes must be a method on numeric values, right? So why can’t I find it in the rdoc for Integer or Numeric? And then #minutes returns some kind of object that has an #ago method. So, um, what kind? I don’t get it. I like to understand what’s going on.
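
For what it’s worth, here is a minimal sketch of the general mechanism that could make a call like this possible. Ruby classes are open, so a library can add methods to Integer after the fact, which would also explain why they don’t show up in the core rdoc. As far as I can tell, in Rails these methods come from ActiveSupport rather than from core Ruby. The names below (SketchDuration, sketch_minutes) are hypothetical, made up for illustration, and this is not the real library’s implementation.

```ruby
# Hypothetical value object (not a real library class): wraps a number of seconds.
class SketchDuration
  attr_reader :seconds

  def initialize(seconds)
    @seconds = seconds
  end

  # Returns a Time that many seconds before `from` (defaults to now).
  def ago(from = Time.now)
    from - seconds
  end
end

# Reopen the built-in Integer class and add a new method to it.
# `sketch_minutes` is an invented name, chosen to avoid clashing with anything real.
class Integer
  def sketch_minutes
    SketchDuration.new(self * 60)
  end
end

reference = Time.at(1_000_000)            # fixed reference time, so the result is repeatable
earlier = 20.sketch_minutes.ago(reference)
puts earlier.to_i                         # prints 998800 (1_000_000 minus 1200 seconds)
```

So the intermediate object is just an ordinary object that remembers a length of time and knows how to subtract itself from a Time.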

Future of Bibliographic Control

I’m finding that the LC hearings of the Working Group on Bibliographic Control are producing some very valuable discussion. I hope that the report the working group ends up producing will be equally valuable–and I hope that somehow, it can actually affect our discourse and practice, instead of just disappearing into a black hole as most similar contributions over the past 15 years seem to have done.

In the meantime you can, and I highly encourage you to, read Mark Lindner’s notes on the meeting, as well as Diane Hillmann’s.

Serials Coverage: Z39.71 vs. ONIX Coverage

Serials Coverage

I have an issue I’d like to put on the radar of ILS developers generally, and open source ILS developers in particular. It seems especially apropos now, since the Evergreen Serials module is in the process of being developed.

When trying to integrate my Link Resolver with my ILS recently, I wanted to accomplish a task that you’d think would come up often: when given a particular journal citation (say, ISSN, volume, and issue), identify whether we have it in print, and identify the particular ILS record(s) that correspond to that print serial holding.

In our environment, this turned out not to be possible to do in a reasonably confident way. Part of the problem is the Z39.71 standard, which is used to express serials coverage/holdings in a human-readable format. While Z39.71 holdings statements are theoretically intended to be consistent and maybe even machine-processable, anyone who has tried to machine-process them will have discovered they aren’t really suitable for recovering the sort of information needed to perform a task like mine.

On top of that, in many actual ILS environments, catalogers end up entering Z39.71 purely by hand. I don’t know if there is even a way to validate Z39.71 holdings statements automatically (I suspect there is not, which is an obvious problem in itself), but I’d guess that in a typical environment around half of the Z39.71 statements in a corpus are not strictly legal Z39.71. Whether that comes from typos, cataloger misunderstanding of the standard, or simple lack of concern with following the standard, I don’t know; it’s probably a different mix in different institutions.
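
To make the task concrete, here is a rough sketch of the kind of check I’d like to be able to run if coverage were recorded as structured data instead of a hand-typed display string. Everything in it is a hypothetical illustration: the CoverageRange struct, its fields, the record ids, and the placeholder ISSN are invented for the example, and this is not the Z39.71 or ONIX data model.

```ruby
# Hypothetical structured holding: an ILS record id plus an inclusive
# [volume, issue] range. A nil `last` means the holding is still open-ended.
CoverageRange = Struct.new(:record_id, :issn, :first, :last) do
  # Array#<=> compares [volume, issue] pairs element by element, in order.
  def covers?(volume, issue)
    point = [volume, issue]
    return false if (point <=> first) < 0
    last.nil? || (point <=> last) <= 0
  end
end

# Return the record ids of the print holdings that cover the cited issue.
def matching_records(holdings, issn:, volume:, issue:)
  holdings
    .select { |h| h.issn == issn && h.covers?(volume, issue) }
    .map(&:record_id)
end

# Made-up sample data: one closed run, then a gap, then an open-ended run.
holdings = [
  CoverageRange.new("rec-1", "1234-5678", [1, 1], [10, 4]),
  CoverageRange.new("rec-2", "1234-5678", [12, 1], nil)
]

p matching_records(holdings, issn: "1234-5678", volume: 11, issue: 2)  # falls in the gap
p matching_records(holdings, issn: "1234-5678", volume: 14, issue: 3)  # in the open run
```

With data shaped like this the question becomes a simple comparison. With free-text holdings statements, you first have to parse the statement, and that is where things fall apart.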

Continue reading “Serials Coverage: Z39.71 vs. ONIX Coverage”


And other impenetrable acronyms.

I share the generalized optimism toward the recent announcement of the DC/RDA joint project.

It’s confusing to talk and think about these sorts of ideas, because to talk about metadata like this, you need to talk so very abstractly. We try to mean very precise things, but we don’t always have the precise words to describe them, or to be understood by people who may not mean the same things by the same words.

I’ve been confused by the DCAM for a while myself. As I keep circling around and around trying to understand what’s going on, at this particular stage in my circling I’ve found the paper Towards an Interoperability Framework for Metadata Standards, by Nilsson et al., to be very helpful, and I think I’m getting closer to understanding what DCAM is. When I look back at the comments I made on Pete Johnston’s blog post, linked above, around five months ago, I already wouldn’t ask those same questions now (although I can’t exactly answer them in clear language either–so tricky to talk about this stuff!).

I do start to wonder, though: Is DCAM trying to solve the _exact_ same problem RDF is? Is there any reason to have both? What does DCAM have that the “RDF suite” does not? Nilsson et al. do say that “The RDF suite of specifications, however, follow a more similar pattern to the framework presented here.”

Erik Hatcher on what’s needed

Erik Hatcher’s essay on their experiences prototyping Blacklight at UVa ought to be required reading for anyone interested in the future of library digital services.

To my mind, the most important point he makes is this:

Let me reiterate that what I see needed is a process, not a product. With Solr in the picture, we can all rest a bit easier knowing that a top-notch open source search engine is readily available… a commodity. The investment for the University of Virginia, then, is not in search engine technology per se, but rather in embracing the needs of the users at a fine-grained level.

This is a point I see many library decision makers not fully grasping. It’s not about buying a product (whether open source _or_ proprietary), it’s about somehow getting multiple parts of the library on board in a coordinated effort to focus our work where it matters. The tech may make this possible for the first time–and some tech may be better than other tech–but tech can’t solve things for you. Just plunking your money down for the ‘right’ product from a vendor (yes, even if that vendor is OCLC!) cannot be an end point.

But organizational strategy is a lot harder than just buying an expensive product, unfortunately.

Browser rendering of mixed language chars?

As I try to set up my OPAC to display non-roman chars (what the librarians call ‘vernacular’, which seems odd to me, so I’ll stick with ‘non-roman’), I have run into a really weird thing with the way browsers display text that mixes Roman and Hebrew characters. This doesn’t seem to affect any of the other non-roman chars we have. I can only guess it’s somehow related to the right-to-left-ness of Hebrew, but it’s weird.

Check out my simple reproduced test case, and see if you can tell me what the heck is going on. Any advice appreciated. (Even once I figure out what’s going on, and even if I can identify a fix, there’s no telling if I can get my OPAC to apply that fix. But anyway.)

See my reproduced simple demonstration:

Issues with my SFX in HIP code

Specifically with the stuff that generates the coverage strings missing from the SFX API. In some cases of SFX including services from ‘related’ SFX objects, my current code will generate empty or incorrect coverage strings that do not match what the SFX menu itself provides.

I am trying to come up with a very hacky workaround for this. If you want the new version, contact me in a couple of days (I should REALLY put this stuff in a publicly available SVN). But a VERY hacky workaround is what it will have to be, and it will probably still have some edge cases it cannot deal with correctly. To explain what the deal is… it’s confusing to explain, but I’ll just give you my Ex Libris enhancement request:

Continue reading “Issues with my SFX in HIP code”