I have it running on my Mac using ollama now; does it say anywhere what quantization scheme is being used? ollama seems a bit opaque here.
When it downloaded the model it only pulled about 4GB, which for a 7.3B-parameter model implies 4-bit quantization. But I don't see that listed anywhere (or an option to use, say, Q8 instead).
If that's the case, I'm pretty impressed after a quick tinker; it feels pretty coherent for a 7B @ Q4.
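For anyone else checking: recent ollama versions can report this, and the model library usually publishes multiple quants under different tags. A rough sketch (the "mistral" name and the exact tag here are just examples; check the model's tags page for what's actually available):

  # print model metadata, including the quantization, on recent ollama builds
  ollama show mistral

  # pull an explicit quant instead of the default
  # (example tag; actual tags vary per model)
  ollama pull mistral:7b-instruct-q8_0

The ~4GB default download is consistent with a 4-bit quant: 7.3B params at ~0.5 bytes each is roughly 3.7GB before overhead.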