Where's the part of the Copilot EULA that indemnifies users against copyright infringement for the generated code?
If the model was trained entirely using code that Microsoft has copyright over (for example: the MS Windows codebase, the MS Office codebase, etc.) then they could offer legal assurances that they have usage rights to the generated code as derivative works.
Without such assurances, how do you know that the generated code is not subject to copyright, or what license the generated code is under?
Are you comfortable risking your company's IP by unknowingly using AGPLv3-licensed code [1] that Copilot "generated" in your company's products?
Ultimately I think MS is right to bet that this will not matter in the long run. All AI tools are trained on copyrighted data. The tools are too useful to cripple with restrictive laws written for a previous era.
>> All AI tools are trained on copyrighted data. The tools are too useful to cripple with restrictive laws written for a previous era.
Are you sure that Disney and Games Workshop will not mind if you make your own Mickey Mouse / Warhammer 40K cross-over movie for profit? (I would watch it!)
Those are protected by trademark law which will still be strong in a post AI world. I'm sure those companies will be making massive use of AI to speed up content production.
You previously stated "The tools are too useful to cripple with restrictive laws written for a previous era." and also "Those are protected by trademark law which will still be strong in a post AI world."
So do you believe in legal protections for trademarked and copyrighted works or not?
If an AI assistant generates content that includes copyrighted or trademarked elements, is the content "too useful to be crippled by restrictive laws" or "protected by laws which will still be strong"?
The pirate movement might have been a bit early for its time, but if AI happens to be the technological advancement needed for society to abolish copyright law, then let's get it done.
With scihub getting blocked and people suffering from subscription fatigue, it is an excellent time for these restrictive laws to be replaced.
> The pirate movement might have been a bit early for its time, but if AI happens to be the technological advancement needed for society to abolish copyright law, then let's get it done.
No argument there; that'd be a huge victory.
But until that happens, tools for "smart completions" need to provide appropriate licensing and attribution metadata.
Please, by all means, have those laws repealed. Until then, a license violation remains a license violation no matter how much AI you run the code through.
"The GNU Affero General Public License is a modified version of the ordinary GNU GPL version 3. It has one added requirement: if you run a modified program on a server and let other users communicate with it there, your server must also allow them to download the source code corresponding to the modified version running there."
So your company is okay providing all their source code to users?
I know that some companies do this, but most do not.
If you use AGPLv3-licensed code in your codebase, you are agreeing to the terms of the license.
Practically all corporate legal teams at companies that create software strictly prohibit the use of AGPLv3-licensed software, if not all GPL-based licenses.
Using AGPL code doesn't mean you agree to the terms. It means if you don't agree to and obey the terms, you don't have a license, which is copyright infringement.
Being taken to court for infringement of free software's copyright is rare. On top of that, the copied code being only a snippet from Copilot makes it even less likely to happen. The snippet alone may not be considered copyrightable.
That's not how copyright infringement works. Great, you've stopped committing new infringements, but there's still a legal case over the previous infringements, for which you still need to provide appropriate remedy.
If the remedy for copyright infringement were just "oh, we got caught, guess we'll stop now", that would provide substantial incentive for people to violate licenses as long as they hoped they wouldn't get caught. The remedy for such violations needs to be substantial enough that it's not profitable to temporarily get away with.
The company would have to pay a fee to the copyright owner plus some extra.
I am jaded from experience with GDPR, where companies' thinking goes along the lines of: if we get caught, we will pay extra, but right now we are making big money. And who knows, maybe we won't get caught.
"GitHub’s defense obligations do not apply if (i) the claim is based on Code that differs from a Suggestion provided by GitHub Copilot, (ii) you fail to follow reasonable software development review practices designed to prevent the intentional or inadvertent use of Code in a way that may violate the intellectual property or other rights of a third party, or (iii) you have not enabled all filtering features available in GitHub Copilot."
Do they define what constitutes "reasonable software development review practices designed to prevent the intentional or inadvertent use of Code in a way that may violate the intellectual property or other rights of a third party" ?
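Whatever that phrase turns out to mean, one crude review step a team could add is checking whether a suggestion's distinctive lines show up verbatim in public repos before merging. A rough sketch only, assuming a GITHUB_TOKEN environment variable with code-search access and the standard /search/code endpoint; GitHub's code search matches on whole words, so a miss proves nothing:

    # Rough check: does this exact line appear in public code on GitHub?
    # Assumes a GITHUB_TOKEN env var; code search matches whole words,
    # so treat zero hits as inconclusive, not as a green light.
    import os
    import sys

    import requests

    def count_public_matches(snippet: str) -> int:
        resp = requests.get(
            "https://api.github.com/search/code",
            params={"q": f'"{snippet}" in:file'},
            headers={
                "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
                "Accept": "application/vnd.github+json",
            },
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["total_count"]

    if __name__ == "__main__":
        hits = count_public_matches(sys.argv[1])
        print(f"{hits} public file(s) contain this line; check their licenses before shipping.")

It obviously doesn't prove provenance either way, but it is the kind of cheap habit I would want documented if I ever had to argue that my team followed "reasonable review practices".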
[1] https://en.wikipedia.org/wiki/GNU_Affero_General_Public_Lice...