yet more on copyright and cataloging

Peter Murray has written a great post analyzing some of the complex issues of copyright and our bibliographic descriptions.

I wrote a comment on that post which turned huge, so I’m posting it here too as its own thing.

This is good stuff, thanks Peter.

I’d offer some clarifications, or maybe they’re actually whatever the opposite of clarifications are. Murkifications?

Controlled vocab assignment

I think it’s actually pretty unclear whether things like controlled subject and classification assignment are sufficiently creative to have a copyright. Maybe, maybe not. I’d rather treat that as an open question than assume they are.


When it comes to original abstracts (not extracts) like we get from the publisher, it is more clear that there is copyright created. But with the exception of those we get from publishers, I don’t think such abstracts are typically found in our catalog records; I don’t think catalogers are typically writing such things.

Who holds copyright?

If there is a copyright, it isn’t necessarily held by the ‘person’ doing the creating. If you create copyrighted work as a regular employee, the copyright belongs to your employer. If you’re a contractor, it could go either way, depending on your contract. If you work for the federal government, though, neither you nor your employer has copyright in the US: works created by the federal government get no copyright in the US (according to the Copyright Act).

Machine-created content/data?

When we get to data (or ‘content’) created by machine, this is a new topic I hadn’t even considered before, and it is also somewhat murky. I suspect it’s unlikely that something created by a machine could possibly have sufficient ‘creativity’ or ‘originality’ to have copyright! The code actually doing the work, which was written by humans, definitely has copyright (and possibly patent) protection. But its output? I doubt it.  That it was created mindlessly by a machine would seem to ipso facto mean it’s not ‘creative’.

How about some real lawyers’ opinions?

But I’m still not a lawyer. All this stuff is murky.  This would be a GREAT topic for a regional consortium or library organization to have its legal counsel write a memo on, similar to the great analysis of the Google Books agreement that I think I remember seeing from ACRL counsel (or was it ARL?), but can’t find now.  (Anyone else know what I’m talking about?)

I hope that such consortiums and organizations aren’t scared that even having counsel write such an analysis would be considered threatening by OCLC, and thus avoid it for that reason. We could use it.


It’s interesting the approach the PDDL takes to all of this. The PDDL is specifically written to say, basically, “We aren’t sure what rights we have in this content/data, across various legal jurisdictions. But no matter what rights we have, we relinquish ALL of them, to the extent we are able to by law, and put it in the public domain. And if there are any we aren’t allowed to relinquish by law, we grant you a free license to them.”

So it’s written to avoid the whole legal mess, and say, no matter what, what we WANT to do is make this open access, and no matter what rights we have, we can do that, so, go.

That’s one reason the PDDL, unlike the CC licenses, isn’t “some rights reserved”, but “no rights reserved”—when you don’t even know what rights you have in the first place, it’s hard to reserve some of them, but easier just to say, whatever they are, I relinquish them.

(Incidentally, that issue makes LibraryThing’s practice of having users attach CC licenses to facts about books rather suspect. If those facts aren’t copyrightable in the first place, then the CC license doesn’t do anything. The CC license is based on the idea that the originator possesses copyright, and is using copyright to allow only certain uses, and enforce certain restrictions. IF there is no copyright in the first place, the CC license is unenforceable.)

So anyway, PDDL.  I think it is in libraries’ best interest to put ALL data they originate under the PDDL, and to try to only enhance records with data similarly released under the PDDL.  If we want to just release the data we originate, the exact legal details don’t necessarily matter; the PDDL can release them no matter what. It’s when we want to restrict our data that the exact legal details matter, as to whether we can.  (The legal details may matter when we want to use data we did NOT originate too, of course.)

Unfortunately, it’s still not exactly clear how to legally apply the PDDL to our complicated databases that may have records from multiple sources, and an individual record may have data/content from multiple sources at multiple times. See the threads in the recently created ODC-discuss list.

I think it should be a priority to clarify how to legally apply the PDDL to library data. Another fine thing for consortial or national organizational counsel to apply themselves to.

But it’s still unclear to me if our actual decision-makers and administrators care about these issues yet. They ought to.


4 thoughts on “yet more on copyright and cataloging”

  1. What about the Dewey System or the BISAC Subject headings? Each is a controlled vocab, but OCLC / BISG claim copyright on them?

  2. Good point. Someone can have copyright on the controlled vocabulary itself. That may mean that you need their permission to use the controlled vocabulary at all.

    But that’s different than having copyright on the assignment in the individual record. The creation of a controlled vocabulary is a creative act, and OCLC can have copyright over Dewey. The decision to assign a particular class number to a particular work, resulting in a field in a particular record with a Dewey number in it, may or may not be. And if it is, it’s not the controlled vocabulary owner that would have copyright over that field, it’s the person who made the decision.

    Except it gets even more complicated. You need the vocabulary owner’s permission to use the vocabulary at all. Do you need the vocabulary owner’s permission to then share the record that you’ve used the vocabulary in, as a result of their copyright of the vocabulary as a whole?

    I have no idea. The particular terms of the license by which OCLC or BISG license people to use Dewey/BISAC might be instructive. Oddly, I don’t think most libraries who use Dewey for cataloging actually HAVE a formal license from OCLC, do they?

    Very confusing, you got me.

    In the end, I remain positive that we are NOT served by people trying to exercise intellectual property control over these things.

  3. Jonathan,

    Could you add the link to the Peter Murray post that you refer to?



    6:40pm EST: Sorry, neglected and corrected. Thanks. -jrochkind

  4. Thank you for your thoughtful commentary on my text, Jonathan. I was assuming that the assignment of subject headings and classification numbers is a creative act; I don’t know of any legal opinions on the subject matter. I should have made that clearer. In the interest of time, I decided not to expand on the fact that copyright is probably held by the employing organization since it would be considered “work for hire” — you are correct on that point too.

    Interesting thoughts about the ability to copyright the output of computer algorithms. I’ll admit that I don’t know for sure about the copyright implication of algorithm outputs, but I was thinking about Mandelbrot set pictures as something that would seem to fall under copyright. Perhaps they don’t either.

    In any case, you are right that we need pointers to or generation of real legal opinion on this matter.
