
From the official FAQ [0]:

> Other than the filter, what other measures can I take to assess code suggested by GitHub Copilot?

> You should take the same precautions as you would with any code you write that uses material you did not independently originate. These include rigorous testing, IP scanning [emphasis mine], and checking for security vulnerabilities. You should make sure your IDE or editor does not automatically compile or run generated code before you review it.

I think lots of companies do run tools such as BlackDuck and others to scan their entire code base and ensure (or at least have some ass-covering) that there is no accidental copyright infringement.

[0] https://github.com/features/copilot#other-than-the-filter-wh...




How much of what you save by using Copilot will then be spent on BlackDuck licenses?

Capex vs opex, huge difference

While the cost to programmers' sanity of running things like BD is immeasurable in my estimation, if you are already doing it, doing it for Copilot code shouldn't add any extra cost, unless Copilot is actually constantly spewing copyrighted code.

> While the cost to programmers' sanity of running things like BD is immeasurable in my estimation

Can you clarify? In my experience, a source scan is just another job in the build pipeline. I've only seen it fail when it does, in fact, detect a new component (or a license change in an existing component), because at that point you have to do the legal dance for third-party notices etc. But that latter part is something you have to do either way, tools or no tools.
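
To make the "just another pipeline job" point concrete, here is a rough sketch of the kind of gate I mean. It is not how BlackDuck or any real scanner works internally; `Component`, `scan_gate`, and the component names are all made up for illustration, and in practice the scanner produces the detected list for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    # Hypothetical record: a detected third-party component and its license.
    name: str
    license: str


def scan_gate(approved, detected):
    """Return findings that should fail the build (empty list means pass)."""
    approved_by_name = {c.name: c for c in approved}
    findings = []
    for comp in detected:
        known = approved_by_name.get(comp.name)
        if known is None:
            # Never reviewed before: needs the third-party notice process.
            findings.append(f"new component: {comp.name} ({comp.license})")
        elif known.license != comp.license:
            # Reviewed before, but the license has changed since then.
            findings.append(
                f"license change: {comp.name} {known.license} -> {comp.license}"
            )
    return findings


# Made-up example data.
approved = [Component("zlib", "Zlib"), Component("left-pad", "MIT")]
detected = [
    Component("zlib", "Zlib"),
    Component("left-pad", "GPL-3.0"),
    Component("foo", "Apache-2.0"),
]
findings = scan_gate(approved, detected)
for line in findings:
    print(line)
```

The job only goes red when `findings` is non-empty, which is exactly the moment you would have to involve legal anyway.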


A source scan is indeed not a problem. Scanning all the binary blobs is where things go wrong, in two ways.

First, there are quite a few false positives, especially if you use commercial 3rd-party components as well. For example, I had a UI component recognized as some obscure academic microkernel!? Investigating, we found that this happened because that microkernel project was using the same commercial UI component somewhere (probably under some academic license), and their repo was just where BD had seen this JS code before.

Second, and this one is much more common and annoying, at least with BD at my company: you have to add an explanation to each individual identified 3rd-party package that uses something like the GPL, affirming that it is being used in a way that complies with the license. If you're doing something like distributing a Linux VM, that means hundreds of packages that are part of the distribution. This work has to be done manually, which means entering the same copy/paste text in hundreds of places in the atrociously slow BD UI.

