People have been suggesting this ever since Copilot was announced, and it doesn't work on any level. They're using all code on GitHub, even code with no license that you can't use for any purpose, and their reasoning is that they see it as fair use, which supersedes any licenses and copyrights in the US.
> since they’re providing a way of consciously allowing probable copyright infringement
I think the claim that it's "probable copyright infringement" is nowhere near proven.
GitHub likely gave that option to satisfy users' lawyers who might have a higher threshold for "clean room" implementations or "no open source". Not as any kind of implication of copyright infringement.
Fair use works as an argument for the usability of Copilot.
Derivative works, per US Copyright law, are not infringement, either.
This article is from June last year.
Has anything happened since then on this?
Is anyone bringing legal action against Copilot and GitHub?
If not, why not? I would think that if there was a case against it, there are open source affiliated entities out there that would have both the legal and monetary resources to go to court over this.
When I look at the discussion here, it looks like most of the arguments are based on common sense or morality. While that is nice, I would have loved to hear a perspective on this based on law instead. Is there someone here with a more legal perspective/background that could comment?
I think Copilot is pretty clearly a copyright violation and in violation of the licenses of "public" code. People uploading code to GitHub are bound to the licenses just the same as anyone; unless you're the legitimate owner of all of the copyright in a codebase, you can't change the license provisions by accepting a ToS.
I don't think it's really that murky. These models contain and have been shown to reproduce copyrighted code with the right prompting; it's not a grey area, it's just obfuscated theft.
GitHub appears to be relying on a fair-use defense for Copilot. So it doesn't matter what license you slap on the code; if they're correct that their use constitutes fair use, as long as they can see your code, they can use it.
Ah, right, so that's only true if you believe that all output from GitHub Copilot infringes on copyright. Most responses from LLMs don't include copyrighted information, or at least not substantial portions of such.
You're mistaking the end-user's copyright infringement with Copilot's alleged infringement.
Copilot is fair use and transformative -- that is unless there is an open source Copilot that Copilot is training on, only then would it be competing and it's easy for GitHub or OpenAI to exclude those repos of copilot alternatives from the training set.
I wasn't staking a position on whether Copilot is fair use, just pointing out that fair use doesn't care about license.
That said, Copilot itself is not a replacement for the open source project that it was trained on. The code it generates may or may not be, but that's probably not GitHub's problem as far as copyright law is concerned.
If GitHub could guarantee that the code Copilot had ingested was only made with OSS licenses, then I don't see what the problem is.
But as far as I understand, GitHub trained Copilot on any public repository on GitHub, even ones that don't have a license specified (so the user publishing it still has the copyright to it), and I don't see how that can be OK.
Even if it is true that the ToS that no one reads allows GitHub to blatantly violate the license of your code, anyone could choose to upload that code to GitHub without the copyright holder's consent. People are going to do it anyway en masse, and that code is going to get eaten by Copilot. Even if copyright holders are constantly playing the game of reporting public repos to GitHub to have them removed, it's not going to be enough.
>this "copilot is stealing" stuff won't even make it to court
IANAL but I am heavily skeptical of your confidence here.
You really seem to be ignoring the core issue by focusing on SO though. Everything on SO is fair game, but code on GitHub is under a variety of licenses, and when Copilot regurgitates it, no matter how complex and inscrutable the process is that leads it to do so, it may be causing the user of Copilot to misuse that code because it doesn't even give them the opportunity to know where it came from or what license it was released to the public under.
GitHub clearly stated they only used publicly available repos in this project. However, as many people are rightfully pointing out, those projects might still be either closed source or copylefted, and if Copilot regurgitates chunks of those projects, people who use it may be subject to infringement lawsuits in the future.
This is what some folks are trying to get the courts to decide regarding GitHub Copilot's skirting of OSS licenses https://githubcopilotlitigation.com.
Nobody seems to consider the fact that non-US law is also applicable to this issue. GitHub's ToS are subject to US/California law, but that does not mean that GitHub need not respect copyright in other jurisdictions where the service is provided. Many common open source licenses do not have a governing law clause (e.g. BSD and MIT). Since one of the primary defences for Copilot is fair use, and this concept does not exist in the EU, it would seem that Copilot is even more legally iffy in other jurisdictions.
The (insightful) point is that if the copyright holder is the one who uploads something to GitHub, that person has also agreed to the ToS. That was something I hadn't considered before reading the article.
That line of argument might defang any claims I might have against Copilot, as I have personally uploaded much of my public open-source code to GitHub.
> The larger issue is that anyone using Github is donating their work for re-use without attribution through Copilot.
I think this can only be valid if GitHub's terms of use clearly specify that or the chosen license allows it.
I actually see Copilot benefiting GPL projects: suppose a programmer uses Copilot to develop proprietary software and Copilot regurgitates GPL'ed code: now the proprietary software is a derivative work and must be GPL'ed too.
Because it exposes their direct hypocrisy in this: it's fair use for OSS but not for us.
Questions here are very important, and it's no surprise GitHub avoided answering anything about Copilot's legality:
https://sfconservancy.org/GiveUpGitHub/