Well, technically it is a MITM proxy for a locally hosted LLM, but people these days prefer simple, catchy names...
But it might be useful if, say, you have a local GPU-powered machine on your LAN. I just wish they weren't using the advanced settings in the Copilot extension and were instead using, say, one of the many OpenAI-powered alternatives (like Genie) -- it would feel less like a hack and more like a genuine alternative.
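For what it's worth, the "MITM proxy" part mentioned above is conceptually tiny: sit between the editor and the model, take the completion request, and forward it to whatever box on the LAN is running the model. A minimal sketch of the idea (not the project's actual code; it assumes an OpenAI-style /v1/completions endpoint on a llama.cpp-type server, and the backend address is made up):

```python
# Minimal sketch of the proxy idea: accept completion requests from the
# editor plugin and forward them to a locally hosted model server.
# Assumes an OpenAI-compatible /v1/completions endpoint on the backend
# (e.g. a llama.cpp / llama-cpp-python server); the address is hypothetical.
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKEND = "http://192.168.1.50:8080/v1/completions"  # GPU box on the LAN

class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body the editor plugin sends...
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # ...and pass it through to the local model server unchanged.
        req = urllib.request.Request(
            BACKEND, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            payload = resp.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), ProxyHandler).serve_forever()
```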
Yeah. My main problem with this is that the Copilot extension is proprietary (or at least it was – maybe that has changed?) and you can't install it on VSCodium, for example.
I tried pretty much all of them with Continue in VSCode, and it's a bit hit and miss, but the main difference is the workflow (Copilot is mostly line completion, Continue is mostly chat or patches). So the main value add here for me would be a more Copilot-like workflow (which seems to align better with the day-to-day experience I've had so far).
This is interesting, because I started using Continue specifically because it doesn't just do tab completion. I have regular GitHub Copilot turned on again as well now, after having it off for a while, and the tab completion does seem to be better than it was a few months ago. Still, explicitly asking Continue to write a diff has a higher 'this is correct' rate for me, because I can describe what I want in detail.
Tab completion seems fine for cases where I add one new line and the next two lines are just incredibly obvious. I am going to experiment with writing a little comment first to see if it primes the tab completion to do something non-obvious.
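Something along these lines, say (a made-up example -- the body is just the kind of completion I'd hope the comment nudges it toward):

```python
# Parse "key=value;key2=value2" pairs into a dict, ignoring empty segments.
def parse_pairs(s: str) -> dict[str, str]:
    # A completion roughly like this is what the comment above is meant to prime.
    return {k: v for k, _, v in (seg.partition("=") for seg in s.split(";") if seg)}
```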
Maybe it depends on the work people are doing. I do a lot of data engineering, where I'm often tabbing through suggestions while adjusting the bits that are off, or writing Python portion by portion, and depending on the complexity it ranges from "finish this string I'm writing" to a few lines' worth of code.
I'm often surprised by how quickly the model picks up on reasonable patterns (even across files, which is often necessary to be correct).
With the diffs, I find that by the time I've described things in a way that has a chance of working, I might as well have just written the whole thing myself. Especially as the diffs often need correcting themselves. With tabs that correction is part of a fast feedback loop; with diffs it's so far a slower loop and just more awkward.
Of course, with changes to workflows one or the other can shine, and if there's an interface that's faster than typing the instructions out, that might just supercharge things for the diff/chat type too.
You should check out https://github.com/smallcloudai/refact. It has both autocomplete and chat. It's in active development, with lots of new features coming soon (context search, fine-tuning for larger models, etc.).
Have you tried Continue with Phind-CodeLlama-34B-v2? I've been waiting to explore local alternatives to my current go-to tools (Cursor + GPT-4), but it seems that local LLMs' quality isn't quite there yet...?
I haven't tried that particular one yet. I might not have enough RAM to run it, but I'm gonna give it a try. It seems like one of the larger code-focused local models. Cheers for the suggestion!
Be careful with Continue. It has opt-out telemetry and by default it seems to send all prompts to the developers. Have a look at ~/.continue/telemetry.log to see what is being sent. Don't enter any passwords, credentials or other sensitive data in prompts unless telemetry is disabled.
I'd really love someone to explain how it's confusing: As the parent says, it's dead obvious that it's not going to be GPT-4 running on your Macbook, the title starts by naming it something else, it aims for the very specific style of completion of Github Copilot...
Normally these projects hijack the word 'local' by sending all of your data to a 3rd party API and that is confusing... but we finally get one that runs the model locally, does what it says on the tin, and some people still find a way to paint it as deception?
It’s confusing because it says “GitHub copilot running locally on your mac” when it is actually Llama running locally on your mac, but with the copilot interface.
So the confusion is that what it says it is, is explicitly different from what it actually is, which is understandably confusing.
Seems to be running on llama.cpp, so it's going to be a question of performance. I don't have an M-series CPU, but on my 13th-gen i5 I can run Mistral at about 6.5 tokens per second, which seems comparable to what this is doing.
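If you want to sanity-check your own numbers, this is roughly how I'd measure it against a local llama.cpp server (the /completion endpoint is the stock server API; the port is the default, and the figure is approximate since generation can stop before n_predict):

```python
# Back-of-the-envelope tokens/sec check against a local llama.cpp server.
# Assumes the stock /completion endpoint on the default port; the estimate
# is rough because generation may stop before reaching n_predict.
import json
import time
import urllib.request

N_PREDICT = 128
payload = {"prompt": "Write a short function that reverses a string.", "n_predict": N_PREDICT}
req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
start = time.time()
with urllib.request.urlopen(req) as resp:
    out = json.loads(resp.read())
elapsed = time.time() - start
print(f"~{N_PREDICT / elapsed:.1f} tokens/sec (rough)")
print(out["content"][:200])
```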
It's because of a quirk of the hardware. The Unified Memory setup that Apple built into the M2 (and M1? not sure) systems means that you have high bandwidth and large amounts of memory available to the GPU inside them, which lets you run LLMs very efficiently and very easily. Combined with the fact that that part of the hardware is the same across all their machines, it makes for a very easy target: you can get a lot built with minimal effort.
The matrix multiplication hardware and data types aren’t standardized yet.
Also, the M1 Max has more bandwidth than an Epyc Milan to actually feed all of that. It's about the same bandwidth as a PS5, but in a mobile package, with none of the latency of GDDR6. Much more powerful than a standard dual-channel consumer CPU.
Hm… the q4 34B Code Llama (which is used here) performs quite poorly in my experience.
Using a heavily quantised larger model gives you the unrealistic impression that smaller and larger models are roughly equally capable… but it's a trade-off. The larger Code Llama model is categorically better, if you don't lobotomise it.
It’d be better if instead of making opinionated choices (which aren’t great) it guided you on how to select an appropriate model…
I find the fact that Copilot is closed source -- not just the model, but even the plugins -- very worrying. Good to see efforts on the alternatives.
It's worrying because it feels like MS will add more and more closed-source "features" to VSCode and undermine its open-source nature.
It also prevents users of other editors from building Copilot plugins. For example, there won't be a Copilot plugin for Emacs that could be accepted into Emacs's official repository.
> Visual Studio being closed source
If VSCode weren't the de facto universal editor accepted by every programming language's community (notice that even this particular thread is about a VSCode plugin!), I wouldn't be so worried.
I tried to use Cody and the UI was confusing and the completions were unbearably slow. I was using the Rider plugin.
I want Cody to work for me, but right now it doesn't. I really, really want whole project awareness (maybe even to leverage a concurrently running language server?) for my completions.
Do you guys have a usage tutorial or a video somewhere? Are you flexible with how your UI is being implemented (ie, can I pitch you ideas)?
I'm sorry to hear that. We have made a lot of improvements to Cody recently. We had a big release on Oct 4 that significantly decreased latency while improving completion quality. You can read all about it here: https://about.sourcegraph.com/blog/feature-release-october-2...
We love feedback and ideas as well, and like I said are constantly iterating on the UI to improve it. I'm actually wrapping up a blog post on how to better leverage Cody w/ VS Studio, that'll be out either later today or sometime tomorrow. As far as feedback though: https://github.com/sourcegraph/cody/discussions/new?category... would be the place to share ideas :)
We have agreements with the LLMs we work with to not store your prompt data or AI generated responses, or use it for training purposes. Data is only used to generate the response and then deleted.
On Sourcegraph side, we do collect some telemetry to improve our products, and for enterprise use cases we can def work with you on what data Sourcegraph collects/stores and how it interfaces. For example, we recently added support for AWS Bedrock so you can run your own instance of the LLM and connect it. So we def have options we can explore with you.
Tried using this; it has great potential, but it is too rough right now, an alpha at best:
- The native Cody app makes me sign in every day for some reason
- The PyCharm plugin says I don't have embeddings, but the native app claims otherwise
- When indexing, it complains it cannot find the repo (I assume it is trying to fetch from the remote, which is a private GitHub repo, rather than the local disk). I worked around this by removing the remote entirely from git, but that is only a temporary solution
- I cannot choose the branch to index (I work on feature branches)
Thanks, I don't know why I didn't think of that.
Just tried it and it works at least with the llama-cpp-python server, but I suppose some other servers and very likely hosted services (SNI, certificate subjects, ...) would have problems with that.
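For reference, the reason it just works with the llama-cpp-python server is that it speaks the OpenAI API, so a client only needs its base URL repointed. Roughly like this (port and model name are placeholders for whatever you started the server with):

```python
# Point the standard OpenAI client at a local llama-cpp-python server.
# Port and model name are placeholders for whatever the server was started with.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")
resp = client.completions.create(
    model="local-model",            # placeholder; matches the loaded GGUF
    prompt="def quicksort(arr):",
    max_tokens=64,
)
print(resp.choices[0].text)
```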
How is context-from-following-tokens implemented in the pure OAI API? I assumed there must be specialized models that have two separate context windows.
Looks cool! Always like to see these local alternatives. I'm a Sublime Text user (it is still amazing!) so there aren't many options for LLM assistants. The only one I found that works for me on Sublime is https://codeium.com/ and it is also free for the basic usage.
They have a great list of supported editors:
- Android Studio
- Chrome (Colab, Jupyter, Databricks and Deepnote, JSFiddle, Codepen, Codeshare, and StackBlitz)
- CLion
- Databricks
- Deepnote
- Eclipse
- Emacs
- GoLand
- Google Colab
- IntelliJ
- JetBrains
- Jupyter Notebook
- Neovim
- PhpStorm
- PyCharm
- Sublime Text
- Vim
- Visual Studio
- Visual Studio Code
- WebStorm
- Xcode
I have found that the completions are decent enough. I do find that sometimes the completion suggestions are too aggressive and try to complete more than I want, so I end up leaving it off until I feel like I could use it.
It isn't a code completion assistant like the one you mentioned above, and it probably never will be. I see it more as a perfect coding companion that is always at your fingertips and relieves you of googling most of the time.
For now it's tied to OpenAI, and you have to pay for it yourself, but the former should change sooner rather than later.
Bonus: in the develop branch there is a sort-of release candidate that is way more robust than the current release.
3) HuggingFace inference framework (https://github.com/huggingface/text-generation-inference). At least when I tested, you couldn't use something like llama.cpp or exllama with llm-ls, so you need to break out the heavy-duty badboy HuggingFace inference server. Just configure and run it. Then configure and run llm-ls. (There's a rough example request after item 4 below.)
4) Okay, I mean, you need an editor. I just tried nvim, and this was a few weeks ago, so there may be better support now. My experience was that it was full, honest-to-god Copilot. The CodeLlama models are known to be quite good for their size. The FIM part is great: boilerplate gets so much easier with the surrounding context. I'd like to see more models released that can work this way.
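For anyone curious what the FIM part looks like on the wire, it's roughly this: a plain /generate call to the TGI server where the prompt carries fill-in-the-middle sentinel tokens (StarCoder-style tokens shown here; Code Llama uses its own <PRE>/<SUF>/<MID> scheme, and llm-ls assembles the prompt for you anyway):

```python
# Rough illustration of a fill-in-the-middle request against a
# text-generation-inference server. The /generate payload shape is TGI's
# standard REST API; the port is whatever you mapped when launching TGI.
# Sentinel tokens below are StarCoder's FIM format (Code Llama differs);
# llm-ls normally builds this prompt for you.
import json
import urllib.request

prefix = "def read_config(path):\n    with open(path) as f:\n        "
suffix = "\n    return cfg\n"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

payload = {"inputs": prompt, "parameters": {"max_new_tokens": 64, "temperature": 0.2}}
req = urllib.request.Request(
    "http://localhost:8080/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # The generated text is the code that belongs at the cursor position.
    print(json.loads(resp.read())["generated_text"])
```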
I would love to be able to take a base model and fine-tune it on a handful of hand-picked repositories that are A) in a specific language I want to use and B) stylistically similar to how I want to write code.
I’m not sure how possible that is to do, but I hope we can get there at some point.
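Not a tested recipe, but the rough shape of it, using a LoRA on top of a base code model, would be something like this (the model name, paths and hyperparameters are placeholders):

```python
# Rough sketch of the idea, not a tested recipe: LoRA-tune a base code model
# on source files gathered from a handful of hand-picked repositories.
# The model name, paths and hyperparameters are placeholders.
import glob

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "codellama/CodeLlama-7b-hf"  # assumed base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Plain-text dataset built from the hand-picked repos (language filter = the glob).
files = glob.glob("picked_repos/**/*.py", recursive=True)
dataset = load_dataset("text", data_files={"train": files})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="style-lora",
                           per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=dataset["train"],
    # Causal-LM collator: pads batches and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```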