Peter Murray has written a great post analyzing some of the complex issues of copyright and our bibliographic descriptions.
I wrote a comment on that post which grew huge, so I’m posting it here too as its own thing.
This is good stuff, thanks Peter.
I’d offer some clarifications, or maybe they’re actually whatever the opposite of clarifications are. Murkifications?
Controlled vocab assignment
I think it’s actually pretty unclear whether things like controlled subject and classification assignment are sufficiently creative to be copyrightable. Maybe, maybe not. We shouldn’t just assume they are.
When it comes to original abstracts (not extracts) like we get from publishers, it is clearer that a copyright is created. But I don’t think such abstracts are typically found in our catalog records, with the exception of those we get from publishers; I don’t think catalogers are typically writing such things.
Who holds copyright?
If there is a copyright, it isn’t necessarily held by the ‘person’ who created the work. If you create a copyrighted work as a regular employee, the copyright belongs to your employer. If you’re a contractor, it could go either way, depending on your contract. If you work for the federal government, though, neither you nor your employer has copyright in the US: under the Copyright Act, works of the US federal government are not subject to copyright in the US.
When we get to data (or ‘content’) created by machine, this is a new topic I hadn’t even considered before, and it is also somewhat murky. I suspect it’s unlikely that something created by a machine could possibly have sufficient ‘creativity’ or ‘originality’ to have copyright! The code actually doing the work, which was written by humans, definitely has copyright (and possibly patent) protection. But its output? I doubt it. That it was created mindlessly by a machine would seem to ipso facto mean it’s not ‘creative’.
How about some real lawyers’ opinions?
But I’m still not a lawyer. All this stuff is murky. This would be a GREAT topic for a regional consortium or library organization to have its legal counsel write a memo on, similar to the great analysis of the Google Books agreement that I think I remember seeing from ACRL counsel (or was it ARL?) but can’t find now. (Anyone else know what I’m talking about?)
I hope that such consortiums and organizations aren’t scared that even having counsel write such an analysis would be considered threatening by OCLC, and thus avoid it for that reason. We could use it.
The approach the PDDL takes to all of this is interesting. The PDDL is specifically written to say, basically, “We aren’t sure what rights we have in this content/data across various legal jurisdictions. But no matter what rights we have, we relinquish ALL of them, to the extent we are able to by law, and put it in the public domain. And if there are any we aren’t allowed to relinquish by law, we grant you a free license to them.”
So it’s written to avoid the whole legal mess, and say, no matter what, what we WANT to do is make this open access, and no matter what rights we have, we can do that, so, go.
That’s one reason the PDDL, unlike the CC licenses, isn’t “some rights reserved”, but “no rights reserved”—when you don’t even know what rights you have in the first place, it’s hard to reserve some of them, but easier just to say, whatever they are, I relinquish them.
(Incidentally, that issue makes LibraryThing’s practice of having users attach CC licenses to facts about books rather suspect. If those facts aren’t copyrightable in the first place, then the CC license doesn’t do anything. The CC license is based on the idea that the originator possesses copyright, and is using copyright to allow only certain uses and enforce certain restrictions. IF there is no copyright in the first place, the CC license is unenforceable.)
So anyway, PDDL. I think it is in libraries’ best interest to put ALL data they originate under the PDDL, and to try to enhance records only with data similarly released under the PDDL. If we just want to release the data we originate, the exact legal details don’t necessarily matter; the PDDL can release them no matter what. It’s when we want to restrict our data that the exact legal details matter, as to whether we can. (The legal details may matter when we want to use data we did NOT originate too, of course.)
Unfortunately, it’s still not exactly clear how to legally apply the PDDL to our complicated databases that may have records from multiple sources, and an individual record may have data/content from multiple sources at multiple times. See the threads in the recently created ODC-discuss list.
I think it should be a priority to clarify how to legally apply the PDDL to library data. Another fine thing for consortial or national organizational counsel to apply themselves to.
But it’s still unclear to me if our actual decision-makers and administrators care about these issues yet. They ought to.