This is a huge and looming legal problem. I wonder if what should be a big uproar about it is muted by the widespread acceptance/approval of GitHub and related products, in which case it's a nice example of how monopolies damage communities.
I think it won't become a legal problem until Copilot steals code from a leaked repository (e.g. the Windows XP source code) and that code gets reused in public.
Only then will we see an answer to the question "is making an AI write your stolen code a viable excuse?"
I very much approve of the idea of Copilot as long as the copied code is annotated with the right license. I understand this is a difficult challenge, but difficulty alone doesn't mean such a requirement should become optional; rather, it should encourage companies to fix their questionable IP problems before releasing these products into the wild, especially if they do so in exchange for payment.
> I very much approve of the idea of Copilot as long as the copied code is annotated with the right license.
I agree, but I somehow doubt that will ever happen. Partly because MS is motivated to muddy the waters and shift norms towards allowing more of this kind of license-defying copying (because they make money from a product that does just that), and partly because the market for the most part doesn't think clearly about these issues. Many commenters here seem really fuzzy on the fact that nearly all code is, with or without an explicit statement as such, copyrighted (thanks, Berne convention), that that copyright (with or without documentation) is owned by someone, and that it is licensing which grants use of copyrighted work under specific circumstances. So as you say, the real problem is losing information about the license.
I'm grateful that the author of some LGPL'd code has triggered this discussion, since it's a more consequential license w.r.t. code reuse.