I don't know ... the video at least is straight up bullshit, right? The terminal at the bottom of the editor session shows that the user has already run "copilot_solution_puzzle_two.py" with the desired answer before he types anything.
Hey, thanks for your comment. I actually ran the code once and it printed the correct answer, but I forgot to record it. So I deleted all of the code, typed it _again_ from scratch while recording, and ran the program _again_; it yielded the same result. (The second run is what's shown in the video.)
This is also something I'm curious about regarding Copilot. I mentioned that "GitHub Copilot has a list of possible solutions to code completions, so I wonder if it’s just luck that the suggested solution was the correct one".
I'm going to test further to see whether it always produces the same code completion, or whether there's some randomness to the completions.
>> This is also something I'm curious about regarding Copilot. I mentioned that "GitHub Copilot has a list of possible solutions to code completions, so I wonder if it’s just luck that the suggested solution was the correct one".
The Codex model on which Copilot is based had about 30% accuracy when generating a single solution to a coding problem, but 70% when it was allowed to generate 100 solutions and choose the one that passed unit tests. On the other hand, when the best-of-100 solution was chosen according to the probability the model assigned to it, it scored 45% [1]. So it's kind of luck.
Basically, Copilot has no way to tell whether it's giving you a right or wrong answer, other than selecting the answer with the highest probability according to its training data, which usually means the most common answer. So the probability that the answer you get is the "right" answer depends on the probability that the right answer is also the most common answer to the problem you give it. If that makes sense?
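To make that concrete, here's a toy sketch of the two selection strategies being compared. The candidate functions and their "probabilities" are made up by me, standing in for actual model samples:

```python
# Hypothetical stand-in for k model samples: each candidate is paired
# with a made-up "model probability". Real Codex candidates come from
# sampling the model; these toys just illustrate the two strategies.
candidates = [
    (lambda a, b: a - b, 0.40),  # most probable, but wrong for this task
    (lambda a, b: a + b, 0.35),  # correct
    (lambda a, b: a * b, 0.25),
]

def pick_by_probability(cands):
    """Copilot-style: return the single highest-probability sample."""
    return max(cands, key=lambda c: c[1])[0]

def pick_by_tests(cands, tests):
    """Best-of-k: return the first sample that passes all unit tests."""
    for fn, _ in cands:
        if all(fn(*args) == want for args, want in tests):
            return fn
    return None

tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
print(pick_by_probability(candidates)(2, 3))   # -1: most probable != correct
print(pick_by_tests(candidates, tests)(2, 3))  # 5: tests filter out wrong samples
```

When the most common answer in the training data happens to be the right one, both strategies agree; the gap in the paper's numbers is exactly the cases where they don't.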
I'm doubtful, too. How would Copilot know what to generate for the second function? How can `get_depth_input` be sufficient information to generate a function that reads a text file "input.txt" line by line and casts each line to an int? If Copilot truly does that, then it's heavily geared towards toy problems.
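For what it's worth, the function that name implies is a two-liner that shows up constantly in Advent of Code solutions, so it's plausible the name alone narrows things down a lot. My guess at what such a completion would look like (not the exact code from the video):

```python
# A guess at the kind of body Copilot would produce for this name:
# read input.txt line by line and cast each non-empty line to an int.
def get_depth_input():
    with open("input.txt") as f:
        return [int(line) for line in f if line.strip()]
```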
Because he literally already told it what he wants it to provide in his existing solution, which you can see in the tree pane on the left. This experiment means nothing.
> Does Copilot use other files open in the editor?
While I am not 100% sure of the sources, my use of Copilot makes me pretty sure it uses other open files in the editor and other files in the current project folder (whether or not they're open in the editor), and makes me suspect it may use the past history of the current file (at least within the same edit session).
That sounds like too much input. Remember that Copilot is based on GPT-3, so its input size is limited to 2048 tokens.
I think it's simpler to assume that "get_*_input" is a common name for a function that reads input from a stream, so this kind of string is common in Copilot's training data. Again, remember: GPT-3. That's a large language model trained on a copy of the entire internet (the CommonCrawl dataset) and then fine-tuned on all of github. Given the abundance of code examples on the internet, plus github, most short programs that anyone is likely to write in a popular language like Python are already in there somewhere, in some form.
The form is an interesting question that's hard to answer, because we can't easily look inside Copilot's model (and it's a vast model to boot). The results are perhaps surprising, although the way Copilot works reminds me of program schemas (or "schemata" if you prefer). That's a common technique in program synthesis where a program template is used to generate programs with different variable or function names etc. So my best guess is that Copilot's model is like a very big database of program schemas. But that's an aside.
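A toy illustration of the schema idea, with a hypothetical template and hole names I made up (real schema-based synthesis is more involved, but the instantiation step is roughly this):

```python
# A toy "program schema": a template with named holes, filled in by
# plain string substitution -- the core idea behind schema-based
# program synthesis, in miniature.
SCHEMA = """\
def get_{name}_input():
    with open("{path}") as f:
        return [{cast}(line) for line in f]
"""

def instantiate(name, path, cast):
    """Fill the schema's holes to produce a concrete program."""
    return SCHEMA.format(name=name, path=path, cast=cast)

print(instantiate("depth", "input.txt", "int"))
```

On this view, `get_depth_input` isn't retrieved verbatim; the model has effectively learned the template and fills in the holes from the name.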
Anyway, I don't think it has to peek at other open files etc. Most of the time that wouldn't be very useful to it.
> GitHub Copilot uses the current file as context when making its suggestions. It does not yet use other files in your project as inputs for synthesis. [1]