
> I think the biggest problem copilot will have in practice gaining traction is that verifying correctness isn’t any faster than writing the code yourself in many cases.

Humorously, this is a similar problem to the one autonomous driving has: staying alert only for the rare moment when something randomly goes wrong is harder than staying alert all of the time.




However, in the real world people don't always write bug-free code and aren't always alert when driving. These AI assistants can therefore still have a net positive effect as long as they are better than the average performance of a human. Of course, three quarters of us probably believe that "I'm not an average programmer, so Copilot would only make me worse."

Personally, I think the more interesting angle is the trolley problem this creates. People will die in self-driving car accidents, and bugs will exist in AI-generated code. Those people and bugs are different from the people who will die in human-caused accidents and the bugs in human-written code. If the computer reduces the overall number and severity of failures, are we willing to forgive the damage directly caused by an AI that falls short of perfection?


I’m a very mediocre developer, and if Copilot is any better than me at writing code, then I will have a hard time understanding whatever Copilot throws at me. I cannot just save, commit and push whatever Copilot suggests... so it’s faster for me to write the code myself than to review Copilot’s code.

> I cannot just save, commit and push whatever Copilot suggests

I don't think that is the goal, just as the goal of the current generation of self-driving cars isn't for you to be able to take a nap in the driver's seat.

Imagine you need some code that would have traditionally taken you an hour to write. I believe the goal of Copilot is to generate the code for you as a starting point. Maybe you don't understand that code immediately and it takes you 20 minutes to figure out what is going on. Then you spend another 20 minutes tweaking it for your exact purpose. If that results in code of similar quality to what you would have written alone, then Copilot makes you more efficient by saving you 20 minutes.


> I don't think that is the goal, just as the goal of the current generation of self-driving cars isn't for you to be able to take a nap in the driver's seat.

I think the issue is that the MVP from a customer perspective is, effectively, being able to take a nap in the driver's seat. From a research perspective there are obviously intermediate milestones, but that doesn't make it fit for what people would want to use it for. Same goes for Copilot.


> I think the issue is that the MVP from a customer perspective is, effectively, being able to take a nap in the driver's seat.

Maybe that is a requirement for some users, but it isn't a universal one. Plenty of people see a benefit in assistive technology that isn't complete, such as adaptive cruise control or boilerplate/scaffolding dev tools.

It also raises the ethical question of whether these creators are responsible for the misuse of their products. Is it enough for them to say "This is how this product should be used. You are on your own if you use it outside these settings."? Holding developers responsible for the misuse of their software could create an actual slippery slope. Where is the line drawn? Do we start punishing people who create encryption algorithms because someone used the encryption to hide evidence of a crime?


> It also raises the ethical question of whether these creators are responsible for the misuse of their products. Is it enough for them to say "This is how this product should be used. You are on your own if you use it outside these settings."?

I don't think you have to answer the ethical question to assess the level of readiness Copilot or self-driving cars are at. It definitely raises the question, but you don't have to answer it to talk about suitability for particular use cases.

As you say, it might address the requirements of some specific people. My argument is that Copilot is not good enough yet for the bulk of imagined use cases, whether or not you call that MVP, and I think the post makes a good argument about why.


I haven't had a chance to try it yet, but I'm skeptical of the time-savings claims for Copilot in its current form. At least when working on a large code base, the things that take time are:

1) Understanding the data model and logic of the code that interacts with the component I'm working on

2) Refactoring existing code to accommodate my change gracefully

3) Writing and fixing tests

4) Working through the code review process

For a major new piece of functionality, add

5) Putting together a design document and reviewing it with relevant stakeholders

The part that is fast is actually writing the code: once I've done steps 1 and 2 (and sometimes 5), writing the new code itself is near trivial. I don't see how Copilot could possibly help me in a meaningful way on these kinds of tasks.

The work that seems most amenable to Copilot's help is things like utility functions for transforming data or calculating things from it, as in the "Easter" example from the article. But here I would rather use a well-tested library or, if one doesn't exist (or I can't use it), write well-documented code that I understand thoroughly.
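To make that concrete, the kind of self-contained utility in question is roughly the following (a minimal sketch using the anonymous Gregorian "Meeus/Jones/Butcher" Easter algorithm; the function name and the choice of algorithm are my own illustration, not necessarily what the article or Copilot produced):

    from datetime import date

    def easter_sunday(year: int) -> date:
        """Gregorian Easter via the anonymous (Meeus/Jones/Butcher) algorithm."""
        a = year % 19                       # position in the 19-year lunar cycle
        b, c = divmod(year, 100)            # century, year within the century
        d, e = divmod(b, 4)
        f = (b + 8) // 25
        g = (b - f + 1) // 3
        h = (19 * a + b - d - g + 15) % 30  # epact-style correction
        i, k = divmod(c, 4)
        l = (32 + 2 * e + 2 * i - h - k) % 7
        m = (a + 11 * h + 22 * l) // 451
        month, day0 = divmod(h + l - 7 * m + 114, 31)
        return date(year, month, day0 + 1)

    assert easter_sunday(2024) == date(2024, 3, 31)

This is exactly the sort of thing you can usually pull from an existing library (e.g. dateutil.easter), and if you can't, the dense arithmetic is a good argument for documenting it yourself rather than pasting in whatever a model suggests.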

Put another way, the work that Copilot seems most adept at is "junior developer" work performed by people operating at a junior level. But if they delegate "figuring things out" to Copilot, they're just going to spend way more time in code review. Or worse, they're not going to spend that time, and will learn nothing/stagnate in their professional progression.

Ever since the advent of satellite nav I've become terrible at learning my way around cities. I'm okay with the loss, since I can generally rely on having nav when I need it, and navigating cities isn't one of my core responsibilities. Copilot is not reliable (it won't answer your question every time), and it automates something that is your actual job. A junior dev might be better served by spending the extra 20 minutes muddling through and building their skillset.


Faster and more fun to write it yourself.

I don't think that's a reasonable assumption - that if you're a mediocre developer you won't understand code written by a better developer.

Better code should largely be easier to understand.


So, you can at least make a theoretical argument for why self-driving cars can do a better job than humans: they are always alert and paying attention, and the set of things they're trying to accomplish is concrete, can reasonably be presumed a priori and baked into the model, and is reasonably well specified, so that we hopefully don't need hard AI to be successful.

By contrast, Copilot doesn't necessarily have any idea what you're trying to do. It can, to an approximation, pattern match on what you've already written and spit out valid code that is "inspired" by things it's seen in the past. But it doesn't actually know what you're trying to do. It doesn't know what your acceptance criteria are, or what invariants you're trying to maintain, or anything like that. And, at least in the places I've worked, most of the interesting bugs (by which I mean the ones that managed to cause trouble in production) happen when the programmer writing the code didn't have a firm idea of what they were trying to do. So that's what worries me: I would fear that the spots where Copilot can't even theoretically be expected to do a good job happen to be exactly the kinds of things for which people would tend to rely on it the most.

Maybe I'm being overly pessimistic? But that's kind of my job - I work in an area where "move fast and break things" is pretty much anathema. It would still be a lot more compelling to me if I could see a paper demonstrating that a team using Copilot has fewer production defects than a team doing exactly the same work without Copilot. Or alternatively, if it were repackaged as something that's a bit like a smarter version of IDE refactorings. "Hey, it looks like you're about to spit out a big old mess of boilerplate. Let us get that for you." Or, "Hey, some functions you called can fail, how about I go ahead and suggest a catch block so you don't forget to write one?" Basically, give me something that's a bit more smart cruise control and a bit less Autopilot.
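To illustrate that last suggestion (purely hypothetical - Copilot doesn't work this way today, and the file name, function name, and fallback behaviour here are made up), the idea is a tool that notices a call which can fail and drafts the handler for you to edit:

    import json
    from pathlib import Path

    def load_config(path: str = "config.json") -> dict:
        # Original one-liner: json.loads(Path(path).read_text())
        # Both the file read and the parse can fail; the imagined assistant
        # would flag that and draft handlers like these for you to adjust.
        try:
            return json.loads(Path(path).read_text())
        except FileNotFoundError:
            return {}  # placeholder fallback: run with defaults
        except json.JSONDecodeError as err:
            raise SystemExit(f"{path} is not valid JSON: {err}")

    config = load_config()

That kind of narrow, prompted assistance leaves the human deciding what the error handling should actually be, which is the "smart cruise control" framing rather than the Autopilot one.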


My cynical opinion is that Copilot will be most useful for people who write junk code. They will be able to write more junk code faster.

> However, in the real world people don't always write bug-free code and aren't always alert when driving. These AI assistants can therefore still have a net positive effect as long as they are better than the average performance of a human.

This is an extremely good analogy -- in both situations, the human will become lazy and stop paying attention (regardless of whether they're supposed to keep their hands on the wheel, literally or metaphorically), and it will be possible to have a net result worse than either human or AI acting alone.


Are we also willing to just start accepting lower-quality code (on average) because we have AI guiding us toward it?

What's the point of striving to write better, more correct code, or to be a safer driver, if all we ever do is rely on the status quo to train models to be average?

