
What alternative are you suggesting?

Things turned out pretty great economy-wise for people in the UK. So that's a poor example even if Luddites didn't hate technology. Not working on the technology wouldn't have done the world any favours (nor the millions of people who wore the more affordable clothes it produced).

I personally think it'd be rewarding to make developers' lives easier, essentially just saving the countless hours we spend googling + copy/pasting Stack Overflow answers.

Copilot is just one project in this technological development; even if a mega-corp like Microsoft doesn't do it, ML is here to stay.

If you're concerned that software developers' job security is at all at risk from Copilot, then you greatly misunderstand how software engineering works.

Auto-completing a few functions you'd copy/paste otherwise (or rewrite for the hundredth time) is a small part of building a piece of software. If they struggle with self-driving cars, I think you'll be alright.

At the end of the day there's a big incentive for GitHub et al. to solve this problem; a class action lawsuit is always an overhanging threat. Even if Copilot doesn't make sense as a business and this pushback shuts it down, I doubt the technology will go away.

I'm personally confident the industry will eventually figure out the licensing issues. The industry will develop better automated detection systems, and if it requires more explicit flagging, no one is better positioned to apply that technologically than GitHub.




CoPilot and IP as it exists are incompatible. They cannot coexist and still make sense. If all it takes to get around licensing is piping it through ML, that sets a legal precedent that legitimizes the practice of "math washing", which is basically dodging personal accountability because the machine did it, not me.

Further, Copilot enables exactly the opposite of what good software engineering is about. We should understand the underlying consequences of the code we write. This includes libraries, dependency graphs, licensing-incurred obligations, etc.

Also, Microsoft quite literally bought the most popular version control as a service company, then leveraged it to create a Machine Learned code generation framework.

They didn't ask for opt in. They didn't do due diligence, they didn't let anyone know ahead of time. They didn't ask anyone, they just did it.

You may look at my last paragraph and think, "Yeah, so what? Welcome to innovation, move fast and break things!"

You may not have noticed if you only pay attention to the tech world, but you live with a bunch of other non-tech people who have to regularly follow way more rules than tech companies have been held accountable for, and a lot of them are realizing that the relative competitiveness of tech is probably coming from their ability to skirt regulations put in place for good reasons.

While yes, sometimes society turns a blind eye and selectively enforces laws, regulations, etc., that generally happens when socio-political agents are confident that harsh, resource-intensive enforcement wouldn't produce enough value to justify the effort. The last decade has seen a lot of non-tech folks becoming more aware that tech isn't out to act responsibly. They're out to make money, and to position themselves into centralized positions of power and exaggerated influence.

Copilot is one more example of tech people being so concerned with whether they could, that once again, no one sat down and wondered if they should.

That is why people are angry.


My level of engagement with this: I was accepted into the technical preview of CoPilot, and while I did generate a few functions and messed around a bit, I don't do enough actual programming these days (because reasons) to have made good use of it. I won't be using it as a commercial product, I'm pretty sure, and I seem to have until August to use it for free, if I'm gonna. Anyways...

Remember Bill Gates' open letter to hobbyists in 1976, in which he blamed pirates for ruining everything? This discussion is much more interesting and subtle than the old debate about piracy's place (and it definitely has a place) in user communities, but I find it humorous that Gates's behemoth, all these years later, seems to have circled around to doing the same thing.

As one who started out by going to swap meets with my Commie 64 and completely disregarding the hand-wringing, moustache-twirling frets of the people RMS (and I) would later refer to as Hoarders, I am pretty sure that I land in the Don't Care category of this debate. Specifically, I think the most compelling point made in the article is when it points out that the tools which were used to create CoPilot are available for free to all, so anyone who is willing to put in the effort of learning how to train models is free to recreate CoPilot for themselves, and share it with anyone.

We all have copypasted code from github or SO at some point without thinking about the license, I am pretty sure, in many cases while at work or otherwise engaged in commercial software activities. CoPilot cannot write entire programs (yet), all it can really do at this point is automate the searching and CTRL-C+V part for snippets and functions, and that is a net good for anyone who writes software.

If there is going to be a lawsuit, I would hope that it hinges strictly on whether github has the right to charge money for it, because essentially, what they are doing by increasing the speed at which people can work is lowering the number of coders required to finish a project on a given timeline. This is both a boon to anyone who is working on something they intend to release FOSS, and simultaneously a device by which greedy Hoarders who want to pay as few people as possible while profiting from the work of those they do pay can increase their profits immensely while creating fewer good jobs for humans.

To put it more simply, when Hoarders use this tech, it is a transfer of wealth out of the pockets of working programmers, into the pockets of shareholders in whatever corporation is reselling this technology. There is good reason, in other words, for working programmers to dislike this, and for FOSS volunteers to like it.

The lawsuit I would like to see happen, if they must, is one which simply seeks to prevent anyone from charging for this, as that would at least keep the investor class from further draining the economy of people who actually work.

Given that the tools for training models are freely available, what I would prefer is if someone with the time implements some form of free version of the same thing. The data they trained the model on for Copilot is freely available to anyone with an internet connection, as are the ML tools. It's extremely clear to everyone that the code copypasta'd by the model does not belong to Microsoft/Github, so if they can sell it, nobody can argue that a free program can't automatically search and copy it for you for free.

There are some that this solution, which I am pretty sure will come about, would not satisfy, and those are the people who can rightly be said to be against innovation and new technologies. It hinges on whether money is changing hands, and whose hands the money is passing into.

Back in those Commie 64 days, we all pirated freely with zero consequence, with one exception that I can recall: one dude in my city started selling pirated software at $5 per floppy. As it happens, he was a friend of my uncle's and I knew him personally. He was a poor man with very little going for him in life, suffice to say, but nonetheless, it only took a couple of months until the police were at his door, and they seized his 64 and all his software.

I disagreed with his actions but I never hated the guy, he was too sad and pathetic to hate if you saw the life he actually lived, but many, many people did, including all the local pirates, because we were firmly of the mindset that software was meant to be shared freely. We may have been wrong about that, but everyone, pirates and non-pirates alike, agreed that the dude taking money was 100% wrong.

I think the same dynamic is taking place here. You will never convince me that copying code from github or SO is wrong, but I agree that CoPilot should not be a commercial thing.


I think people should do the opposite and move their code to GitHub if it helps Copilot. It's the single most important invention of my lifetime. Its existence trumps any ideological or legal hurdles.

The problem here is that Copilot exploits a loophole that allows it to produce derivative works without a license. Copilot is not sophisticated enough to structure source code generally; it is overtrained. What is an overtrained neural network but memcpy?

The problem isn't even that this technology will eventually replace programmers: the problem is that it produces parts of the training set VERBATIM, sans copyright.

No, I am pretty optimistic that we will quickly come to a solution when we start using this to void all microsoft/github copyright.


"It doesn't affect my job yet because I'm not an artist, so I don't care".

Yet we see managers at software companies offshoring to cheap contractor coding sweatshops in the South East of the world, which HNers on this site see as a 'problem' and hate on.

Well guess what? Copilot will make it worse and will have cheap software shops generating code / project templates and getting more for less rather than employing an expensive full time typical western software engineer. Open source will just accelerate the drive down against closed source alternatives as well.

That is the start of the race to the bottom. Then it would be a problem for software engineers, software companies, etc. Open source is already eating itself and closed-source alternatives, and tools like Copilot used by contractors will just accelerate this.

That would then be your problem.


I'm kind of wondering if this controversy might not end up being a storm in a teacup.

From what I've seen, Copilot really lowers the barrier to writing buggy code. If indeed it does turn out to be a tool that lends itself to machine gunning rather than shooting yourself in the foot, it almost doesn't matter who owns what IP.

The relentless attempts at developer commodification will, of course, continue, but I can already sense this one ending up like the developer outsourcing craze of the mid-2000s that the Economist also got a little too excited about.


Adding to this:

I run product security for a large enterprise, and I've already gotten the ball rolling on prohibiting copilot for all the reasons above.

It's too big a risk. I'd be shocked if GitHub could remedy the negative impressions minted in the last day or so. Even with other compensating controls around open source management, this flies right under the radar with a C-130's worth of adverse consequences.


It doesn't at all challenge any assumption we have about the current world, unless you've been living with the assumption that you can't automate the process of producing code snippets.

Copilot is no different than Stack Overflow, with the exception that your selection mechanism is an artificial neural net rather than a bunch of people with up and downvotes in front of their screens. The reason people are uncomfortable here isn't some vague PR speech notion about the future, it's that Copilot appears to be quite literally ignoring software licenses. Can we not devolve into this Silicon Valley corporate speak of rebranding a company ignoring intellectual property as innovation?


Either these things stop or software licensing is meaningless. Pick one.

People somehow arguing that both things can be untrue are delusional imo. If copilot is ok, then obfuscating code in order to get around a license is ok.

Copilot can produce AGPL code verbatim, and per the license Copilot should also be AGPL, but that’s not going to happen practically, so Copilot will probably be shut down.

This is to say nothing of commercial code.

Best way to think of it is like - if copilot was a group of people manually writing and sending code would it be ok? No.

Why should big tech be able to effectively reproduce the work of the little guy without following the licenses they set?


I'm sure you see you've been downvoted. But I want to say I agree with you on the millions / billions bit.

While I understand the rub about licenses, the fact is the vast majority of code is not all that original or unique. Some fringe amount is, and those edge cases are worth discussing.

But the rest? Likely, all in all, not all that special. Yes, we get paid good money to do it. But is that a function of what it takes to do the work, or the demand for the skill (relative to supply of that skill)?

Frankly, I think some ppl just plain ol' fear Copilot. And either don't want to admit it, or they have buried that fear. I'm not advocating ignoring the law / licenses. But putting a license and lipstick on what is an everyday pig doesn't make that pig a unicorn. Does it?


You want to defend the big company? Why? Copilot is obviously breaking the terms of licenses. At the very least it should only be trained on MIT-licensed code and return MIT-licensed code.

I find the proposed solutions in the article quite sensible and fair:

> Allow GitHub users and repositories to opt-out of being incorporated into the model. Better, allow them to opt-in. Do not tie this flag into unrelated projects like Software Heritage and the Internet Archive.

> Track the software licenses which are incorporated into the model and inform users of their obligations with respect to those licenses.

> Remove copyleft code from the model entirely, unless you want to make the model and its support code free software as well.

About the last point: Many of the larger corporations avoid copyleft code entirely or only use it in ways that avoid its viral nature. So I have to assume that Copilot would also be avoided for the same reasons by the same corporations until they can opt out.

Free software is already being exploited and violated in various ways by various actors and Copilot seems to be a tool that makes this even easier. I think it will be interesting to see where this issue goes.


As hypotheticals:

- "you were hired to write code for us, not to use an autocomplete service that makes us liable for both copyright and patent lawsuits, I hope you like getting fired."

- "as per this project's license, we can only take code on board that you contributed under our license, but an audit shows that your PR/MRs contain tons of GPL/MIT/Whatever licensed code instead. We're going to have to back all of that out, and we're going to revoke your contributor status"

- etc.

If you don't know where the code in your autocomplete comes from (and Copilot can autocomplete large swathes of code) then literally anything that comes out of "you didn't write this code" may apply. From fraud (depending on what contract you signed) to trademark infringement, to license violations, to even just simply misrepresenting your skills to an employer. As with all things, it's a sliding scale, but just because the majority of incidents will be on the benign part of the spectrum doesn't mean the litigating part doesn't exist, and that's what your legal department plans for.

Work for a big company? Good bet you're not allowed to use Copilot. And depending on the company, not even "for your personal projects", because you might accidentally read someone else's license-encumbered code that you would not have come up with yourself, which may open your employer up to "you stole our ideas instead of properly crediting/paying for licenses".

Copilot is a legal nightmare.


Nice straw-men you're collecting there.

First one: Uses of attribution and copyleft licenses are just ego-boosting, instead of legitimate protection of authorship against corporate piracy.

Second: Criticism of said corporate exploitation of community work is the actual entitled behaviour. Oh, it's also abusive.

Third straw man: that people who oppose CoPilot in its current form just want to defend copyright around boilerplate stack-overflowish type code.

All false.

I can only assume... You're either too young and inexperienced to remember the early days of the copyleft, free software, and open source movements and why these licenses exist (and still need to exist)... or your values are so backwards that you just think it's OK to harvest other people's hard work for your own (or your employer's) profit.

To be clear: There is no heuristic at work in something like CoPilot that can distinguish between boilerplate code and genuine innovation. It has been shown multiple times to just freely copy and paste novel, copyrighted code; without attribution or conforming to license restrictions. That is unacceptable and deserving of legal countermeasures.

I would have no problem with CoPilot copying only the code of people who have opened their code for that kind of use. But that's not what it does.

Notable that Microsoft, its owner, is not training CoPilot on its own massive corpus of code. Just other people's code.


And I don't know what you're even getting at with the "assembling working programs" vs "navigating license minutiae" stuff. Copilot helps the former, and you're apparently trying to argue that it should be banned because some people are feeling angsty about the latter?

Nobody wants to completely ban AI or training. You can have progress in this area and still respect people's intellectual property. Framing this as destroying progress is silly.

Why can't they train on the code they own, such as Windows sources for example?

Or even better, why can't they release CoPilot itself under an open source license that is compatible with the licenses of code they would like to train on?

Also, I don't think anyone cares about the monetary aspects. The idea behind the GPL-style license is to make sure that code remains free, regardless of what or who uses it. Freedom in this context refers to the ability to study the code, modify the code, and distribute any modifications. Without the GPL the code can be used in a proprietary product which strips those rights away from users of the product.

Copyleft uses copyright laws to attempt to guarantee freedom for users. This is the inverse of what normal copyright does, which is allow a single entity to sit on the ideas and not allow others to benefit from them.

If we can just strip copyleft licenses from projects, we are giving up those guarantees that GPL code will remain free for all users.

The GPL is trying to do its job here, not slow down progress. Progress would be everyone benefiting from the technology behind CoPilot, rather than just Microsoft sitting on the project and selling it as a service.


Copilot is a product -- at least indirectly -- of Microsoft, a company that for about a decade made very public pronouncements about how they disagreed with the GPL (or copyleft generally), found it problematic, and tried actively to discourage its use.

Today's MS isn't really the same, and they've clearly made their peace with Linux. But it still happens that the GPL is in some fundamental ways at odds with commercial exploitation of open source code. So any corporate entity is going to struggle with it because at best it requires being very careful in distribution, or trying to negotiate or cut a deal with the licensor. At worst it can lead to legal problems and IP leakage on your own product.

So, not claiming any conspiracy. Or any intent to violate. But it is in the convenient interests of companies like MS/OpenAI/GitHub to treat open source work as effectively public domain rather than under copyright, and to push the limits there.

The risk to an employer is of course the accidental introduction of such copylefted material into their code-base through copilot or similar tools.

I suspect two sources of disconnect with the broader community on Hacker News that doesn't seem to see the issue here:

a) Many of the folks on this forum are working in the full-stack/web space where fundamentally novel, patented, or conceptually difficult algorithms and data structures are rare. For them Copilot is an absolute blessing in helping to reduce the tedium of boilerplate. However in the embedded systems, operating systems, compiler, game engine dev, database internals etc. world there are other aspects at work. In certain contexts, Copilot has been shown to reproduce complicated or difficult code taken from copyrighted or copylefted (or maybe even patented) sources without attribution. And apparently now with some explicit obfuscation.

To put it another way: it's unlikely that Copilot's going to violate licenses with its assistance with turning your value/model objects from one structure to another, or writing a call into a SQL ORM. But it's quite possible that if I'm writing a DB join algorithm or some complicated math in a rendering engine or a compiler optimization phase that it could "crib notes" from a source under restrictive license... because those things are absolutely in its learning set and the LLM doesn't "know" about the licensing behind them.

b) Either misunderstanding of, or lack of knowledge of, or outright hostility to... copylefted or attribution licenses which require special handling.


Good bye and good riddance. Even just the idea that GitHub should be allowed to train their proprietary AI on other people's work is insane. Much less distribute that AI in a paid package which lets you spit out other people's code verbatim. Anyone who supports open-source and the (ab)use of copyright law to create free works should be vehemently opposed to Copilot.

I agree - it's problematic enough that licensing information gets lost in the Copilot process, but as is we basically have developers contributing their time and expertise, for free, to the development of Microsoft's new paid proprietary product. Worse still, if Copilot is as revolutionary as some people make it out to be, those same developers are inadvertently helping Microsoft build a monopoly in a new market, with all the disastrous consequences that entails.
