They also have the ability to install malware on Windows and use everyone's source code for training, but choose not to, because private code is private. Their own code isn't an exception. Microsoft code in GitHub repos is used for training, just like the rest.
Because... it's private code. Can the company be 100% certain there are no passwords, DB keys, or other company secrets in it? Can they be certain there's no personal employee data? Internal product names? A hundred other similar concerns with proprietary IP? Regardless of how the LLM transforms it, the individual bits of data are still there.
On the other hand, if the repo is already public on GitHub, then exposing it via an LLM does not introduce any new security risk.