Q: best practices for *simple* contributor IP/licensing management for open source?

So, like many non-huge non-corporate-supported open source projects, many of the open source projects I contribute to go something like this (some of which I was original author, others not):

  • Someone starts the project in a publicly accessible repo.
  • If she works for a company, in the best case she got permission from her employer (who may or may not own copyright to code she writes) to release it as open source.
  • She sticks some open source license file in the repo saying “copyright Carrie Coder” and/or the name of the employer.

Okay, so far so good, but then:

  • She adds someone else as a committer, who starts committing code. And/or accepts pull requests on github etc, committing code by other authors.
  • Never even thinks about licensing/intellectual property issues.

What can go wrong?

  • Well, the license file probably still says ‘copyright Carrie Coder’ or ‘copyright Acme Inc’, even though the code by other authors has copyright held by them (or their employers). So right away something seems not all on the up and up.
  • One of those contributors can later be like “Wait, I didn’t mean to release that open source, and I own the copyright, you don’t have my permission to use it, take it out.”
  • Or worse, one of the contributors’ employers can assert they own the copyright and did not give permission for it to be released open source, so you don’t have permission to use it (and neither does anyone else that’s copied or forked it from you).

Heavy weight solutions

So there’s a really heavy-weight solution to this, like the one the Apache Foundation uses in their Contributor License Agreement. This is something people have to actually print out, sign, and mail in. Some agreements like this actually transfer the copyright to some corporate entity, presumably so the project can easily re-license under a different license later. (I thought Apache did this, but apparently not.)

This is kind of too much overhead for a simple non-corporate-sponsored open source project. Who’s going to receive all this mail, and where are they going to keep the contracts? There is no corporate entity to be granted a non-exclusive license to do anything. (And the hypothetical project isn’t nearly important or popular enough to justify trying to get umbrella stewardship from Apache or the Software Freedom Conservancy or whatever. If it were, the Software Freedom Conservancy is a good option, but still too much overhead for the dozens of different tiny-to-medium sized projects anyone may be involved in.)

Even as far as individuals go, the set of committers may very well change over the life of the project, and may not include the original author(s) anymore.

And you don’t want to make someone print out, sign, and mail something, then wait for you to receive it, before accepting their commits; that’s not internet-speed.

Best practices for a simpler solution that’s not nothing?

So doing it ‘right’ with that heavy-weight solution is just way too much trouble, so most of us just keep ignoring it.

But is there some lighter-weight better-than-nothing probably-good-enough approach? I am curious if anyone can provide examples, ideally lawyer-vetted examples, of doing this much more simply.

Most of my projects are MIT-style licensed, which already says “do whatever the heck you want with this code”, so I don’t really care about being able to re-license under a different license later (I don’t think I do? Or maybe even the MIT license would already allow anyone to do that). So I definitely don’t need, and really can’t handle, paper print-outs.

I’m imagining something where each contributor/accepted-pull-request-submitter basically just puts a digital file in the repo, once, that says something like “All the code I’ve contributed to this repo in past or future, I have the legal ability to release under license X, and I have done so.” And then I guess in the License file, instead of saying ‘copyright Original Author’, it would be like ‘copyright by various contributors, see files in ./contributors to see who.’
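To make that idea a bit more concrete, here is a rough sketch of how a project might check that everyone who has committed also has a declaration file on record. The naming convention (one file per contributor under ./contributors, named after their commit email plus `.txt`) and the function name `missing_declarations` are purely hypothetical, made up for illustration; the list of author emails is assumed to come from something like `git log --format=%ae`. This says nothing about whether the scheme holds up legally.

```python
from pathlib import PurePosixPath


def missing_declarations(commit_authors, contributor_files):
    """Return the commit author emails that have no declaration file.

    commit_authors: iterable of author email strings
        (for example, the output lines of `git log --format=%ae`).
    contributor_files: iterable of paths under ./contributors, assumed to be
        named "<email>.txt". This naming scheme is a hypothetical convention.
    """
    # Strip the directory and the ".txt" suffix to recover the email.
    declared = {PurePosixPath(path).stem.lower() for path in contributor_files}
    # Compare case-insensitively and de-duplicate repeat committers.
    authors = {email.lower() for email in commit_authors}
    return sorted(authors - declared)


if __name__ == "__main__":
    authors = ["carrie@example.org", "dan@example.org", "eve@example.org"]
    files = ["contributors/carrie@example.org.txt"]
    for email in missing_declarations(authors, files):
        print("no declaration on file for", email)
```

A check like this could run before merging a pull request, so the maintainer is reminded to ask a new contributor for their file.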

Does something along those lines end up working legally, or is it worthless, no better than just continuing to ignore the problem, so you might as well just continue to ignore the problem?  Or if it is potentially workable, does anyone have examples of projects using such a system, ideally with some evidence some lawyer has said it’s worthwhile, including a lawyer-vetted digital contributor agreement (that is itself licensed such that someone else can re-use it, ha)?

Any ideas?

11 thoughts on “Q: best practices for *simple* contributor IP/licensing management for open source?”

  1. I wouldn’t suggest imagining new things when it comes to legal issues ;)

    I would suggest considering the Developer’s Certificate of Originality (DCO) process as adopted by the Linux project and others (including Evergreen). When Evergreen was in the process of joining the Software Freedom Conservancy, that process was considered acceptable practice (IIRC, the Software Freedom Law Center did take a glance) – no doubt in part because it is a well-established practice. And talk about lightweight: using the git Signed-off-by tag indicates that you’ve read the DCO and agree to its terms.

    For a recent discussion and description of the DCO (in the context of the Project Harmony discussions which were focused primarily on the much heavier-weight CLA processes), see http://lists.harmonyagreements.org/pipermail/harmony-drafting/2011-August/000099.html for example.

    See also http://lists.harmonyagreements.org/pipermail/harmony-drafting/2011-August/000103.html where James Bottomley asserts: “Now the legal advice to the kernel process is that email sign-off signified by the Signed-off-by: tag is sufficient for this purpose, but that doesn’t preclude a whole range of other mechanisms for collecting sign-offs.”

    Note that I am not a lawyer and this does not constitute legal advice!

  2. Thanks a lot Dan! I wasn’t looking to re-imagine something, I just hadn’t found this example in my googling, I was only finding other ‘heavier weight’ examples.

    The DCO looks like a good model.

    I’m still having trouble finding documentation of the actual process used by Linux (or Evergreen, or anyone else): what people do to indicate and archive their sign-off of the DCO. But now that I have some more things to google for, I’ll keep looking.

  3. Ah, I’m finding some hints that the linux process involves adding lines to ‘patches’ — most of the open source projects I’m involved with don’t work by passing around patches, but by direct commits by developers or pull requests. Still, it can probably be adapted, but I’m having trouble finding docs for contributors of _exactly_ what the process entails.

    Are there any docs like that for Evergreen, for contributors, explaining what they need to do to comply with the DCO process?

    The actual text of the DCO looks perfect for using, just need to figure out the right workflow/process for registering sign-offs.

  4. Ah, and as I continue spamming my own blog, I did find this for evergreen:

    http://open-ils.org/documentation/contributing.html

    In the case of brand new algorithmic or feature additions, include the DCO 1.1 in your final patch submission email. In order to make sure that no IP-protected code ever leaks into the Evergreen repository, we will need to have a DCO for all major contributions. This is not an assignment of copyright, nor an accusation of theft. It simply states that the code you submitted is yours to contribute, is unencumbered to the best of your knowledge, and that you are free to submit it without any restrictions, such as academic or employment ties. You can always find a copy of the DCO 1.1 here.

    Okay, so evergreen development does work by passing around patches, and you have people actually include the entire text of the DCO with each patch?

  5. Given that git branches = one or more patches applied via commits, I guess I could say “Yes, we pass around patches” – but we primarily do that by pointing to a branch and issuing a merge request. We’ve made the transition from a Subversion + patch-based workflow to a git-based primarily branch-based workflow and some of our process documents are lagging.

    FWIW, we have submitted a request for review of the submission process so I suspect some of this will be ironed out further in the near future.

    FWIW, the Linux Foundation process is at http://www.linuxfoundation.org/content/how-participate-linux-community-0 (section 5.4).

  6. Okay, cool. So what’s your actual procedure right now for people filling out the DCO?

    The Evergreen guidelines I found still say “include the DCO 1.1 in your final patch submission email.”

    But if you guys aren’t doing “patch submission emails” anymore, you’re using git… what are you doing to have committers file the DCO, how does it get registered/filed that they’ve agreed to it?

    The Linux submission guidelines also seem to be written assuming code will be contributed via an email with a patch file, and tell people to do the same, include the DCO in their email. (Or it’s kind of ambiguous if you have to include the entire DCO, or just the line “Signed-off: Your Name.” Earlier emails about the DCO process in Linux I found implied you should include the entire DCO, I think, as the Evergreen docs currently say to do too.)

  7. For Linux, the Signed-off-by: tag is all you need. I’ll reiterate my quote from James Bottomley: “Now the legal advice to the kernel process is that email sign-off signified by the Signed-off-by: tag is sufficient for this purpose”.

    That’s our intent in Evergreen as well, as suggested by “For all significant code contributions, please use git’s sign-off feature to assert that the code you are submitting is in accordance with the Developer Certificate of Origin (DCO) 1.1”.

    Of course, we go on to confuse things by talking about including the DCO in patches but that’s a legacy of the older Subversion process that, as noted previously, should get cleaned up soonish.

    Anyways: this is why I suggested the git sign-off == DCO assertion process was a lightweight alternative.

  8. I’m still confused. What do you guys actually do right now?

    Ask people to put “Signed-off: My Name” in every git commit message? Do I have that right?

    Oh wait, there’s a git “sign off feature” I was not previously familiar with? I’ll go googling on that…. Huh, that’s what you guys use? Sorry, I was confused cause I didn’t know that feature existed.

    It seems odd to me that a simple “signed off” tag is legally sufficient, without any indication of what they were signing off on! But if that’s really what linux kernel has decided is good enough….

  9. Huh, as far as I can tell, the git “--signoff” param still only applies to actual patch files sent around, it’s not an argument for “git commit”… if you’re actually committing directly to the repo (or committing directly to another repo and then someone else pulls) instead of sending around .patch files generated by git…. I can’t quite figure out…

    Hmm, documentation to the contrary, it looks like you can do `git commit --signoff`.

    So that’s what you guys do, just that, the --signoff flag on all commits? Sorry if I’m being dense, just trying to figure out the details here.

  10. Yes. “git commit -s” is what we use in probably 99% of cases now.

    http://git.evergreen-ils.org/?p=Evergreen.git;a=log

    The second “Signed-off-by:” tag in a given commit means “I’ve reviewed & tested the patch and found it good”. Which I personally worried about being confusing at first, but have been won over because it’s a hell of a lot easier to do “git cherry-pick -s” and add your sign-off than to manually amend the commit to include “Reviewed-by:” / “Tested-by:” tags.

  11. Okay, cool, this is helpful. It looks like “git commit -s” does no more or less than automatically add “Signed-off-by: name email” to the end of the commit message.

    It’s still mystifying to me that this is good enough, since it contains no indication of what “signed off by” means, nothing explicit relating it to the DCO. But hey, if it’s good enough for Linux.

    The remaining oddity to me is that the DCO makes reference to “the open source license indicated in the file”, assuming every source file will indicate an open source license. For Linux, in fact I think every source file does point to the GPL and include a copyright statement too. But for Evergreen, much like for my own projects, this does not seem to be the case.

    Also, the “-s” tag works for ongoing commits, but what about all the source I already have in my open source projects before I hypothetically started doing this?

    Here’s what I’m contemplating doing for Umlaut. Umlaut’s only got about 6 committers in its history. I’m thinking of creating a ./contributors directory in source. Have everyone that’s committed code so far commit a copy of the DCO with their name on it to that directory. Get future committers or people I’m pulling from to do that. The end. Every git commit already has a name and email attached to it, even without “-s”. That name and email, combined with a ‘signed’ DCO in source that was committed by the very same name/email, seems just as good. The DCO already says “By making a contribution to this project, I certify that…”, okay, good. Maybe modify the DCO _slightly_ to not say “open source license indicated in file”, but make reference to the project license, or even include it inline (the MIT-style license is short enough to do that).

    I suppose if the given project ever got big enough to make that unwieldy, it could switch to the “git commit -s” approach for commits going forward, at that point.

    That’s what I’m thinking. Not saying anyone else should do that, but it seems possibly the right approach for my projects. Definitely having the DCO instead of having to come up with text from scratch is a huge win, I hadn’t known about that before, thanks. (Presumably since it comes out of Linux, the DCO itself is licensed such that other people can use it, including modify it, one hopes!)
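For anyone who wants to see the sign-off mechanics from the thread above spelled out, here is a minimal sketch of roughly what `git commit -s` appears to do to a commit message, plus a helper for listing who has signed off. It only approximates git’s behavior (git’s real trailer handling has more rules, and git takes the name and email from the committer identity), and the function names `add_signoff` and `signers` are invented for this illustration.

```python
import re

# A sign-off trailer line as it appears in a commit message:
#   Signed-off-by: Full Name <email@example.org>
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>]+)>$", re.MULTILINE
)


def add_signoff(message, name, email):
    """Append a Signed-off-by trailer to a commit message (approximation)."""
    trailer = f"Signed-off-by: {name} <{email}>"
    body = message.rstrip("\n")
    lines = body.splitlines()
    # Do not repeat a sign-off that is already the last line.
    if lines and lines[-1] == trailer:
        return body + "\n"
    # Stack directly under an existing sign-off; otherwise leave a blank
    # line between the message body and the trailer.
    separator = "\n" if lines and SIGNOFF_RE.match(lines[-1]) else "\n\n"
    return body + separator + trailer + "\n"


def signers(message):
    """Return (name, email) pairs for every sign-off, in order."""
    return [(m["name"], m["email"]) for m in SIGNOFF_RE.finditer(message)]


if __name__ == "__main__":
    msg = add_signoff("Fix typo in README\n", "Carrie Coder", "carrie@example.org")
    msg = add_signoff(msg, "Dan Reviewer", "dan@example.org")
    print(msg)
    print(signers(msg))
```

The second sign-off in the usage example mirrors the Evergreen convention described in comment 10, where a reviewer adds their own tag under the author’s.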
