
Yes, a GGUF-converted version works fine with llama.cpp for me



It works with the current GGUF model format for llama.cpp - you can find converted models on Hugging Face, or you can convert them yourself.
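If you go the manual route, here's a rough sketch of the usual convert-then-quantize flow, driving llama.cpp's bundled tools from Python. The script and binary names (convert_hf_to_gguf.py, llama-quantize) are from recent llama.cpp checkouts and have changed between versions, so treat the exact names and paths as assumptions:

    # Sketch: convert a Hugging Face checkpoint to GGUF, then quantize.
    # Tool names vary across llama.cpp versions - adjust as needed.
    import subprocess

    # 1. Convert the HF model directory to an fp16 GGUF file.
    subprocess.run(
        ["python", "convert_hf_to_gguf.py", "path/to/hf-model",
         "--outtype", "f16", "--outfile", "model-f16.gguf"],
        check=True,
    )

    # 2. Quantize to 4-bit (Q4_K_M) to shrink the memory footprint.
    subprocess.run(
        ["./llama-quantize", "model-f16.gguf", "model-q4_k_m.gguf", "Q4_K_M"],
        check=True,
    )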

Only a few download links are baked in at the moment, but whatever *.gguf file you put in your Downloads folder should appear in the dropdown.


Does anyone know if this works with llama.cpp?

llama.cpp has preliminary support already. https://github.com/ggerganov/llama.cpp/issues/1063#issuecomm...

Also, doesn't llama.cpp use ggml?

Or at least has parts of that project copy-pasted into its tree?



GGUF is just a file format. The ability to offload some layers to the CPU is not specific to it, nor to llama.cpp in general - indeed, it was available before llama.cpp was even a thing.
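For what it's worth, here's a minimal sketch of how layer splitting looks with the llama-cpp-python bindings; the model path is a placeholder, and n_gpu_layers just controls how many layers leave the CPU:

    # Minimal sketch using llama-cpp-python: split layers between CPU
    # and GPU via n_gpu_layers. "model.gguf" is a placeholder path.
    from llama_cpp import Llama

    llm = Llama(
        model_path="model.gguf",
        n_gpu_layers=20,  # offload 20 layers to the GPU, keep the rest on CPU
    )
    out = llm("Q: What is GGUF? A:", max_tokens=64)
    print(out["choices"][0]["text"])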

Can this be run with llama.cpp?

I don't know the answer to your question, but did you know you can download the standalone llamafile-server executable and use it with any gguf model?
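If you try that route, here's a hedged sketch of driving it from Python - it assumes the executable is in the current directory and that it exposes llama.cpp's /completion endpoint on port 8080, as the early releases did:

    # Sketch: launch llamafile-server with a GGUF model and query the
    # llama.cpp-style /completion endpoint. Paths, port, and the
    # startup delay are assumptions.
    import json, subprocess, time, urllib.request

    server = subprocess.Popen(["./llamafile-server", "-m", "model.gguf"])
    time.sleep(15)  # crude: give the server time to load the model

    req = urllib.request.Request(
        "http://127.0.0.1:8080/completion",
        data=json.dumps({"prompt": "Hello", "n_predict": 32}).encode(),
        headers={"Content-Type": "application/json"},
    )
    print(json.load(urllib.request.urlopen(req))["content"])
    server.terminate()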

Anyone wanna convert this to GGML so we can run it with llama.cpp?

Thanks, maybe it's as easy as downloading the GGML file and running it with llama.cpp. I'll try that!

Same question. So far I've found this thread, where people are working on it: https://github.com/ggerganov/llama.cpp/issues/1602

It just uses llama.cpp as a "backend", so if I see this correctly, this should work anywhere llama.cpp works.

So... this won't work with llama.cpp, right? I need a different runtime for it? I'm new to this LLM stuff.

I believe ggml is the basis of llama.cpp (the OP says it's "used by llama.cpp")? I don't know much about either, but when I read the llama.cpp code to see how it was created so quickly, I got the sense that the original project was ggml, given the amount of pasted code I saw. It seemed like quite an impressive library.

No, llama.cpp only works with LLaMA-based models, like base LLaMA, Alpaca, Vicuna, ...

It is based on Mistral, which llama.cpp supports, so I assume it does run (you might need to convert it to GGUF format and quantize it, as sketched upthread).

Absolutely, I'll add llama.cpp support soon.

I assume this doesn’t yet run on llama.cpp?

It's possible to run llama.cpp on Windows; e.g., see this tutorial:

https://www.youtube.com/watch?v=coIj2CU5LMU

Would this version (ggerganov) work with one of those methods?

