Jonathan, you say “our current metadata environment is seriously and fundamentally broken in several ways”. What are the ways in which it is broken? I would say the cataloguing community have just been overtaken by a tsunami of change in the last ten years (mainly the shift to digital information) and are still working out how best to respond and adapt.
I suspected someone would ask that of me after the last post. A definitive argument/explanation for why/what is broken in our current environment has yet to be written, and is not an easy thing to do. All I can do is provide a sketch of some notes toward that thesis, which I’ll try to do here.
1. The issues brought up in the LC working group’s Users and Uses meeting are one good place to start for some overall background. Karen Coyle provides a good summary that includes some of the issues.
2. There is far too much duplication of labor among working catalogers. We lack a good technical and customary infrastructure for efficiently sharing corrections and improvements made in one location with the larger community. We fail to take advantage of as much work as we could for the larger good, and have significant resources being spent on duplicating data.
3. There are very basic questions of high interest to our users that our data set is unable to answer, even though we are spending time recording information that ought to be available to answer these questions. One very good example–and it’s just one example–is Roy Tennant’s analysis of the inability to say whether full content is available online even though we are already spending time recording URL information.
- We do not spend nearly enough time investigating and identifying and working to solve these sorts of problems. Why did Roy have people on AUTOCAT telling him this problem was clearly imaginary, and didn’t exist?
4. We have drawn a wall around what is and what is not of interest to ‘cataloging’ that is not necessarily backed up by any good rationale. Many things that we decide are not of interest (like the above issue?) are in fact of high significance to the success and ease our users will have in carrying out the tasks we mean to support. We do this even within the data found in a MARC record, and also according to type of material and source of data. I don’t mean that “Catalogers” need to apply the exact same standards to journal articles, institutional repository metadata, data from Lorcan’s other three sources of metadata (thanks Peter). But we do need to consider it our responsibility to figure out how all these things can fit together. Catalogers need to be metadata professionals stepping up to figure out the overall control regime that can fit these things together.
- We need to think seriously about how we will share our metadata with other communities and vice versa.
- As an aside, one “pet peeve” of mine, which actually isn’t a “peeve” at all but a serious problem, is the MARC-8 character encoding.
5. Related, we have too many different standards, controlled vocabularies, standards bodies, organizations, and sub-communities with overlapping domains, which produce un-harmonized data without enough coordination. One example of the problems this causes is form/genre information. Form/genre is of high interest to our users. And it is found in at least half a dozen places in the MARC record, from at least three different controlled vocabularies from three different places: LCSH $v information; GMD/SMD; and MARC-allowed coded values and guidance from MARC itself (which does count as a controlled vocabulary!). How can we help users find what they need and understand what they’ve found (see faceted browsing) in terms of form/genre from this mish-mash?
- To be clear, form/genre is conceptually a very difficult problem. Although it may seem simple to the users (“I just want to find videos(/biographies/science fiction)! What’s the problem?”), we all know that it’s a conceptually thorny set of concepts that are difficult to deal with systematically. That’s no excuse for not working on it though, and the apparatus we have in place instead binds us in inertia.
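To make the form/genre scatter in point 5 a bit more concrete, here is a rough sketch of what software has to do just to gather the candidate values from one record. The record is a hypothetical, hand-built Python dictionary with invented values, not the output of any real MARC library, and it only looks at three of the places form/genre can live: Leader/06 (type of record), the GMD in 245 $h, and form subdivisions in 6XX $v.

```python
# Hypothetical, simplified record. Real MARC parsing is omitted and the
# values are invented for illustration only.
record = {
    "leader": "00000ngm a2200000 a 4500",  # position 06 is "g" (projected medium)
    "fields": [
        {"tag": "245", "subfields": [("a", "Cosmos"), ("h", "[videorecording]")]},
        {"tag": "650", "subfields": [("a", "Astronomy"), ("v", "Popular works")]},
        {"tag": "650", "subfields": [("a", "Space sciences")]},
    ],
}


def form_genre_hints(rec):
    """Collect every form/genre-like value, labelled with where it came from."""
    hints = []
    # Coded value defined by MARC itself: Leader position 06, type of record.
    hints.append(("Leader/06", rec["leader"][6]))
    for field in rec["fields"]:
        for code, value in field["subfields"]:
            if field["tag"] == "245" and code == "h":
                # General material designation, free text in brackets.
                hints.append(("245 $h (GMD)", value))
            elif field["tag"].startswith("6") and code == "v":
                # Form subdivision on a subject heading.
                hints.append((field["tag"] + " $v (form subdivision)", value))
    return hints


for source, value in form_genre_hints(record):
    print(source, "=>", value)
```

Even in this toy case the three values come back in three different shapes (a one-letter code, a bracketed free-text term, and a subdivision phrase), and nothing in the data says how they relate to one another. That reconciliation is the part our current apparatus leaves to every individual system.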
That’ll do for a start. Something deserves to be said more generally about creating data that’s of use to machine processing (for the end goal of presenting things to users in better ways, naturally! We don’t care about the machines for the sake of machines) as well as for direct human consumption (Human finds record somehow->what we record has to be intelligible to human once found). But I’m still working out how to say/justify that clearly for an audience that doesn’t already agree with it.
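One way to illustrate the gap between human-readable and machine-usable data is the full-text question from point 3. The sketch below is my own guesswork heuristic, not Roy Tennant’s analysis and not anyone’s production code. It assumes 856 fields have already been reduced to small dictionaries holding the second indicator and the $u, $3, and $z subfields, and the URLs, notes, and keyword list are all invented for illustration.

```python
# Hypothetical 856 fields, reduced to second indicator plus subfields.
# All URLs and notes are invented for illustration.
links = [
    {"ind2": "1", "u": "http://example.org/toc/123.html", "3": "Table of contents"},
    {"ind2": "0", "u": "http://example.org/fulltext/123.pdf"},
    {"ind2": " ", "u": "http://example.org/item/456", "z": "Connect to resource"},
]

# Assumed keyword list: phrases that suggest the link is NOT the full content.
NOT_FULL_TEXT = ("table of contents", "description", "sample")


def guess_full_text(link):
    """Return True, False, or None (cannot tell) for one 856 field."""
    label = (link.get("3", "") + " " + link.get("z", "")).lower()
    if any(phrase in label for phrase in NOT_FULL_TEXT):
        return False
    # Assumption: second indicator 0 ("resource") with no qualifying note
    # means the link leads to the item itself.
    if link.get("ind2") == "0" and not label.strip():
        return True
    return None


for link in links:
    print(link["u"], "=>", guess_full_text(link))
```

A person reading “Table of contents” or “Connect to resource” on a screen copes fine. The software has to fall back on keyword guessing, and for the third link it can only shrug, even though somebody spent time recording that URL.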
Now, these are some very difficult problems. That we have them is not an indication that 100 years of cataloging practice has “failed”. In fact, the metadata system/environment we have now was very intelligently optimized for the social, economic, and technical context of the mid 20th century. It is arguably the best that could be done in that context. But that’s not the context we are in anymore. We have new demands and new possibilities and new challenges. Yes, “the cataloguing community have just been overtaken by a tsunami of change in the last ten years” (although I’d say it’s not just about the fact that information resources are increasingly in digital form. That’s in fact less significant to me than the change from card catalog to online environment, which I think we still haven’t made successfully–and that’s going on 20, 25 years.) The result is a broken system.
In the 21st century, our library metadata environment (by which I mean the interacting system composed of people, institutions, organizations, rules, standards, data sets, and computer software; “system” in the sense of General Systems Theory, not just “system” in the sense of “Systems Department”) is in fact, I still argue, broken.
It is the role of a professional and strong community of catalogers to work on fixing it. Don’t forget that Lubetzky, Cutter, and Panizzi were all in fact “cataloging radicals”, challenging and rethinking how things had always been done for new social, economic, and technical contexts. Where is our Lubetzky for the 21st century?
 “Unfortunately, standard rules had become too much of a good thing. An undue proliferation of rules was the topic of “Crisis in Cataloging” as identified by the Librarian’s Committee of 1940 at the Library of Congress and immortalized by Andrew Osborn, one of the members of the Librarian’s Committee, in 1941.
“The Library of Congress together with ALA took the lead to examine the rules, and Seymour Lubetzky was hired to discover ‘Is this rule necessary?’ usually answering, ‘no’. Catalogers had become too focused on creating the perfect record according to LC standards, which they also complained not even LC had achieved.”
From “Cooperative Cataloging: past, present and future”, by Barry B. Baker. “Has also been published as Cataloging & classification quarterly, volume 17, number 3/4 1993”–T.p. verso. Found by me via a Google search.