Yeah, that's a mess, but it's way too much legal baggage for me, an otherwise innocent end user, to want to take on. Especially since I personally tend to try to monetize a lot of my work.
I understand there's no way for the model to know, but then it's really on Microsoft to ensure that no private, poorly licensed, or proprietary code is included in the training set. That sounds like a very tall order, but I think they're going to have to; otherwise they'll eventually run into legal problems with someone who has enough money to make it hurt.
Open source code is open source only when its license is obeyed. When the license is not obeyed, e.g. the copyright notice is not reproduced, it should be treated as private code, except for dual-licensed code.
Of course the model can know whether the code is repeated in multiple repositories with different licences. The people who maintain Copilot simply don't care to make it do so.
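To make that concrete, here is a rough sketch of how such a check could work on the training data side. This is only an illustration, not how Copilot is actually built: the function names and the `(repo, license_id, code)` input shape are made up for the example, and exact hashing after whitespace normalization is a much cruder match than a real deduplication pipeline would need.

```python
import hashlib
from collections import defaultdict


def normalize(code: str) -> str:
    # Strip each line and drop blank lines so that trivial
    # reformatting does not change the hash.
    return "\n".join(line.strip() for line in code.splitlines() if line.strip())


def conflicting_license_groups(files):
    """Group identical files and keep groups seen under more than one license.

    `files` is an iterable of (repo, license_id, code) tuples.
    This input shape is hypothetical, chosen only for the sketch.
    """
    seen = defaultdict(set)
    for repo, license_id, code in files:
        digest = hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()
        seen[digest].add((repo, license_id))
    # Keep only the hashes that show up under at least two distinct licenses.
    return {
        digest: sorted(entries)
        for digest, entries in seen.items()
        if len({lic for _, lic in entries}) > 1
    }
```

Anything this returns is code whose provenance is ambiguous, so it could be excluded from training or sent for manual review. It would not catch renamed variables or partial copies; that would need token-level or fuzzy matching.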