thoughts on ‘kits’, and thinking in terms of modern metadata practices

From a thread on the RDA listserv, discussing how to enter the things we’ve called ‘kits’ under RDA. The general consensus seemed to be that RDA supported entering the content/carrier/genre of each constituent part, but didn’t necessarily support a top-level aggregate content/carrier/genre entry for the ‘kit’. Karen Coyle made a comment that I interpreted (perhaps incorrectly) as suggesting that a top-level content/carrier/genre entry was required, and that the individual constituent part ‘analytics’ might not be. (Again, I may have misinterpreted what Coyle was suggesting, so take this as a hypothetical for-the-sake-of-argument position, not necessarily Coyle’s.)

My response challenging this is below. I think it also provides an example of how to think about metadata entry in terms of modern metadata practices: in terms of how our systems will be using the metadata, and how we want them to.

On 9/10/2011 4:24 PM, Karen Coyle wrote:

I think the user probably wants a single expression that gives her an idea of what the resource is. I’m not convinced that the 336/7/8 separation and terminology supports a real function, so I’d like to hear from folks with more knowledge about what functionality it was designed for. I am presuming (maybe too optimistically) that the developers of RDA (and ONIX) had some actual use in mind.

‘Analytic’ constituent part entry seems definitely useful to me

So it _seems_ pretty clear to me that it’s useful to have ‘analytic’ entry of the carrier/form/content types of the contained items.

If there’s a kit containing a DVD, and the kit is under a certain topic, mightn’t a user want to find that when looking for DVDs on that topic? If some kits contain flash cards and some don’t, mightn’t a user want to find just the ones that contain flash cards, or want to find all flash cards in the system, whether in a kit or cataloged separately?
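As a rough illustration of those two searches, here is a minimal Python sketch. The `Part` and `Record` structures, the sample titles and topics, and the `find_with_carrier` helper are all made up for this example (they are not any real ILS or MARC library API), and I'm assuming a DVD is recorded with the RDA carrier term "videodisc" and flash cards with "card".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Part:
    """One constituent part: one 336/337/338 triple (hypothetical structure)."""
    content: str   # 336 content type
    media: str     # 337 media type
    carrier: str   # 338 carrier type


@dataclass
class Record:
    title: str
    topics: set = field(default_factory=set)
    parts: list = field(default_factory=list)


def find_with_carrier(records, carrier, topic=None):
    """Return records having at least one part on the given carrier,
    optionally limited to a topic. Kits and single items are treated alike."""
    return [
        r for r in records
        if any(p.carrier == carrier for p in r.parts)
        and (topic is None or topic in r.topics)
    ]


# Invented sample data, for illustration only.
DVD = Part("two-dimensional moving image", "video", "videodisc")
WORKBOOK = Part("text", "unmediated", "volume")
FLASH_CARDS = Part("text", "unmediated", "card")

catalog = [
    Record("Spanish starter kit", {"Spanish language"}, [DVD, WORKBOOK, FLASH_CARDS]),
    Record("Spanish on film", {"Spanish language"}, [DVD]),
    Record("Geology flash cards", {"Geology"}, [FLASH_CARDS]),
    Record("Geology field kit", {"Geology"}, [WORKBOOK, DVD]),
]

dvds_on_spanish = find_with_carrier(catalog, "videodisc", topic="Spanish language")
all_flash_cards = find_with_carrier(catalog, "card")
print([r.title for r in dvds_on_spanish])
print([r.title for r in all_flash_cards])
```

Both queries pick up the kit only because its parts are recorded individually; with nothing but a top-level ‘kit’ designation, neither would.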

So it seems pretty clear to me that the ‘analytic’ encoding is useful. But of course some actual user studies would be valuable.

On the other hand, I agree we (to serve our users) also want to identify the aggregate resource that is presumably being cataloged as an aggregate whole for some reason (like it was intended by the creator/publisher to go together).

Is top-level aggregate format/content/carrier/genre/‘type of thing’ entry necessary?

Now if the record simply contains multiple 336/337/338 of different carrier types, then the record already contains enough meaning for the system to know exactly that — that this item consists of an aggregation of several things with different carrier types/media/etc.   And any such things could be presented to the user in searches/filters/displays as “Kit” or whatever other user-understood term is appropriate.

Whether any of our systems _will_ do this any time soon is sadly another question. But I think it’s generally a huge mistake to create data in inconsistent or redundant ways (redundant means more expensive for the cataloger to enter) to cater to 20-year-old crappy systems; it locks us into the crappy environment forever. So let’s assume for the sake of argument, for now, that systems will do this if it is appropriate. Perhaps there could be suggestions, whether in RDA or a subsidiary document, for exactly what algorithm systems should use to look at the 336/337/338 and decide whether the record should be understood as a ‘kit’. (More than one carrier, and not all of the same carrier type?)
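For the sake of argument, that parenthetical rule could be sketched like this in Python. This is only my guess at one possible heuristic, not anything RDA specifies; the function names and the "Kit" display label are invented here, and the input is assumed to be the list of 338 carrier type terms already pulled out of the record.

```python
def looks_like_kit(carrier_types):
    """Heuristic: treat the record as a 'kit' when it lists at least
    two distinct 338 carrier types. Blank values are ignored."""
    distinct = {c.strip().lower() for c in carrier_types if c.strip()}
    return len(distinct) >= 2


def display_format(carrier_types):
    """Pick a single user-facing label without any top-level 'kit' coding."""
    if looks_like_kit(carrier_types):
        return "Kit"
    # Otherwise fall back to the (single) carrier type, if there is one.
    return carrier_types[0] if carrier_types else "Unknown"


print(display_format(["videodisc", "volume", "card"]))
print(display_format(["volume", "volume"]))
```

Two volumes cataloged together come out as "volume", not "Kit", which is the "not both of the same carrier type" condition.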

There’s still a question of whether this is sufficient. Is it sufficient to say that _anything_ with more than one 336 not of the same type is a “kit”, or is whether something is a “kit” instead something that can only be determined by a judgement of creator intent, that is, whether the items were intended by the creator to go together in a particular kit-like way? If the latter, then indeed a separate coding of the aggregate would need to be in the record somehow. (This would be even more clear if the high-level description we wanted was “instructional kit” specifically. That can’t be inferred merely by seeing that the thing includes a DVD and a workbook: some things including a DVD and a workbook are ‘instructional kits’, some aren’t.) If the former, that anything with multiple items of different carriers in it can be considered a “kit” for our user community’s purposes, then no additional coding is necessary.

The 336/337/338 content/media/carrier-type ontology

As another topic, if what we’re asking is whether the separation into the triple of 336, 337, and 338 makes sense in the first place (whether for a single item or an aggregate ‘box o stuff’), I think the answer is that it turns out to be _very_ difficult to develop an ontology/vocabulary/terminology for what turns out to be the complicated and context-sensitive notion of content/carrier/genre/form/format/type/whatever-you-call-it. Our users’ own notions of these things are _not_ consistent, and are _very_ context and community dependent.

But if we give up on being consistent and just throw terms into a giant grab bag of form/format/genre/carrier, well, that’s pretty much what we had with MARC GMD/SMD. (I say MARC and not AACR2 intentionally here: AACR2 doesn’t even mention these! That only our _encoding format_ standard mentions this data element is a separate problem.) And it ended up just turning into a mess which made it very difficult for systems to serve users well, especially in non-typical contexts.

I think what RDA decided was that we should come up with as consistent and rational an ontology as possible for form/format/genre/etc., and that once the data is encoded rationally, different systems could take it and slice, dice, recombine, and display it differently as appropriate for the context or user community. That is impossible to do if you go with the “irrational/inconsistent grab bag” approach. On the other hand, with the “rational consistent ontology which the system can display how it wants” approach, you pretty much need the system to do some calculation/slice-and-dice before displaying; the data is not meant to be displayed as-is. That is throwing some people for a loop, since they’re used to GMD/SMD being displayed to users using exactly the terminology entered in the MARC source. (GMD/SMD was not really meant to be searchable/sortable/collocatable at all; it was designed purely as a display field, which is part of the problem. As soon as you want to design for filtering and collocation, that’s when you really need a rational, consistent ontology, not the grab bag approach.)
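Here is a small Python sketch of what I mean by the system doing the slicing and dicing. It is an illustration under my own assumptions: the `Triple` structure, the two rule tables, their labels, and `label_for` are all hypothetical, and the only things taken from RDA are the content/media/carrier terms themselves.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    content: str   # 336 content type
    media: str     # 337 media type
    carrier: str   # 338 carrier type


# Each rule is (content, media, carrier, label); None is a wildcard.
# Two invented display profiles over the same encoded data.
GENERAL_CATALOG = [
    (None, "video", "videodisc", "Video disc"),
    ("text", "unmediated", "volume", "Book"),
    ("text", "computer", "online resource", "E-book"),
]
COARSE_BROWSE = [
    ("two-dimensional moving image", None, None, "Video"),
    ("text", None, None, "Reading material"),
]


def label_for(triple, rules, default="Other"):
    """Return the label of the first rule matching the triple."""
    for content, media, carrier, label in rules:
        if (content in (None, triple.content)
                and media in (None, triple.media)
                and carrier in (None, triple.carrier)):
            return label
    return default


dvd = Triple("two-dimensional moving image", "video", "videodisc")
ebook = Triple("text", "computer", "online resource")

for item in (dvd, ebook):
    print(label_for(item, GENERAL_CATALOG), "|", label_for(item, COARSE_BROWSE))
```

The cataloger enters the triple once; each interface decides what to call it. A grab bag of free-text display terms gives the system nothing to match rules like these against.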

Anyhow, I think RDA made the right choice (although reasonable people can certainly disagree on the right choice, I think reasonable people will all agree that RDA’s choice is at least defensibly reasonable, heh), and that the 336/337/338 content/media/carrier three-facet ontology is as complete, flexible, and consistent an ontology as I’ve seen anywhere for this stuff. I think whoever came up with it did a good job of analysis there.


One thought on “thoughts on ‘kits’, and thinking in terms of modern metadata practices”

  1. Finally a positive piece on RDA! Yes, metadata creators should move away from wanting/expecting to see exactly what they entered in their users’ discovery tool (the common catalogue). We might need to sell it to those reluctant to let go of GMD/SMD by reminding them that if we use terms (even codes) that the machine can process, and use them consistently(!), then we can save on entering descriptions that are for display only. A win-win situation for both display and search.
    Also, nice to read something detailed and with practical background. Thanks!
