
Gamers have a TON of really good, really affordable options. But you kind of need 24GB min unless you're using heavy quantization. So 3090s and 4090s are what local LLM people are building with (mostly 3090s, as you can get them for about $700, and they're dang good).
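For intuition, a rough rule of thumb for inference needs is parameter count times bytes per weight, plus some headroom for the KV cache and activations. A minimal sketch of that arithmetic, assuming a ~20% overhead factor (an illustrative guess, not a measured number):

    # Rough VRAM estimate for LLM inference: weights plus some overhead.
    # The 20% overhead for KV cache/activations is an assumption for illustration.
    def estimate_vram_gb(params_billions: float, bits_per_weight: int, overhead: float = 0.20) -> float:
        weights_gb = params_billions * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
        return weights_gb * (1 + overhead)

    for params, bits in [(13, 16), (13, 4), (34, 4), (70, 4)]:
        print(f"{params}B @ {bits}-bit ~= {estimate_vram_gb(params, bits):.1f} GB")

A 13B model at 16-bit already lands around 30GB, which is why 24GB cards plus 4-bit quantization are the usual compromise.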



If anything, 24GB is probably the sweet spot that gets optimised for, as many local LLM enthusiasts are running 3090s and 4090s (and all of them want more VRAM).

Gamers can breathe easy, though, because LLMs require cards with as much GPU memory as possible, while gamers are fine with smaller-memory cards.

I can buy a DDR5 64GB kit from Crucial for $160.

https://www.crucial.com/memory/ddr5/ct2k32g48c40u5

If a $1000 GPU came with that, it would blow everything else out of the water for model size. Speed? No. Model size? Yes.

If it came with 320GB, I could run ChatGPT-grade LLMs. That's $800 worth of DDR5.

Instead, I get 24GB on the 3090 or 4090 for $2k.

A $3k LLM-capable card would not be a hard expense to justify.
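The price-per-GB comparison works out roughly like this, using only the numbers already in the comment (the $2k/24GB figure is of course for a whole card, not bare memory chips):

    # Price-per-GB comparison, using the figures from the comment above.
    ddr5_price, ddr5_gb = 160, 64        # Crucial 64GB DDR5 kit
    gpu_price, gpu_vram_gb = 2000, 24    # 3090/4090-class card (whole-card price)

    print(f"DDR5: ${ddr5_price / ddr5_gb:.2f}/GB")               # ~$2.50/GB
    print(f"GPU VRAM: ${gpu_price / gpu_vram_gb:.2f}/GB")        # ~$83/GB
    print(f"320GB of DDR5: ${320 / ddr5_gb * ddr5_price:.0f}")   # ~$800, as claimed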


Are you aware that cards with “LLM” levels of VRAM (40-80GB) cost substantially more, and that the status quo for consumer cards hovers around 4-12GB, only going up to 24GB for top-end cards?

I picked up a 3090 Ti for this purpose given the price drop. The 24 GB of VRAM is hard to beat.

A 24GB card is not for gaming. It can do games, which is convenient, but no game currently made needs that much.

You can finetune Whisper, Stable Diffusion, and LLMs up to about 15B parameters with 24GB of VRAM.
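As a rough illustration of how a fine-tune like that fits in 24GB, here is a minimal sketch of loading a base model in 4-bit and attaching LoRA adapters with the Hugging Face transformers/peft/bitsandbytes stack; the model id and LoRA hyperparameters below are placeholders, not recommendations:

    # Minimal sketch: 4-bit base model + LoRA adapters so fine-tuning fits in ~24GB VRAM.
    # Model id and LoRA settings are placeholder assumptions.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    model_id = "some-org/some-13b-model"  # placeholder

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # put what fits on the GPU
    )

    lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # only the small adapter weights get trained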

Which leads you to what hardware to get. Best bang for the $ right now is definitely a used 3090 at ~$700. If you want more than 24GB of VRAM, just rent the hardware, as it will be cheaper.

If you're not willing to drop $700, don't buy anything, just rent. I have had decent luck with vast.ai.
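A quick break-even sketch for the buy-vs-rent decision; the hourly rental rate is an assumed placeholder, not a quoted vast.ai price:

    # Break-even hours for buying a used 3090 vs renting a comparable GPU.
    # The rental rate is an assumption for illustration; check current marketplace prices.
    used_3090_price = 700    # from the comment above
    rent_per_hour = 0.30     # assumed placeholder rate

    breakeven_hours = used_3090_price / rent_per_hour
    print(f"Break-even at ~{breakeven_hours:.0f} GPU-hours "
          f"(~{breakeven_hours / 24:.0f} days of 24/7 use)")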


OK, can someone catch me up to speed on LLM hardware requirements? Last I looked, I needed a 20GB VRAM card to run a good one. Is that not true anymore?

Today is a gaming product announcement, and I'm not sure that games need even more than 12GB today. I suspect the 4090 price not budging much is telling as far as what RAM amount demand has been focused on for games.

I assume they will soon have a professional card announcement that includes 48GB+ cards. Assuming the high-RAM cards see improvements similar to this generational leap in the gaming market, they will be in high demand.


24GB is enough for some serious AI work. 48GB would be better, of course. But high-end GPUs are still used for things other than gaming, from ML/AI to creative work like video editing, animation renders and more.

Going above 24GB is probably not going to be cheap until GDDR7 is out, and even that will only push it to 36GB. The fancier stacked GDDR6 stuff is probably pretty expensive, and you can't just add more dies because of signal integrity issues.

Agreed, but two RTX 3090s/4090s should be as capable in this regard (having 2x 24GB).
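For what it's worth, the usual way to treat two 24GB cards as one 48GB pool for inference is layer-wise sharding; a minimal sketch with the transformers/accelerate stack (the model id and memory caps are placeholders):

    # Minimal sketch: shard one large model across two 24GB cards with device_map="auto".
    # Model id and memory caps are placeholder assumptions.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "some-org/some-70b-model"  # placeholder

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",                    # accelerate splits layers across the GPUs
        max_memory={0: "22GiB", 1: "22GiB"},  # leave headroom on each 24GB card
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # so a 70B fits in ~48GB
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

Note this is naive pipeline-style sharding: layers run one card at a time, so it pools memory rather than adding speed.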

The 24GB VRAM also comes with a ~$2000 price tag. That's not really a regular gamer level card.

If you want to run decently heavy models, I'd recommend getting at a minimum 48GB. This allows you to run 34B Llama models with ease, 70B models quantized, and Mixtral without problems.

If you want to run most models, get 64GB. This just gives you some more room to work with.

If you want to run anything, get 128GB or more. Unquantized 70b? Check. Goliath 120b? Check.

Note that high-end consumer GPUs top out at 24GB of VRAM. I have one 7900 XTX for running LLMs, and the best it can reliably run is 4-bit quantized 34B models; anything larger is partially in regular RAM.


Please give us consumer cards with more than 24GB VRAM, Nvidia.

It was a slap in the face when the 4090 had the same memory capacity as the 3090.

An A6000 is 5000 dollars; ain't no hobbyist at home paying for that.


24GB of VRAM is awesome for running AI models though.

If you want to get the most bang for your buck, you definitely need to run quantized versions. Yes, there are models that run in 11GB, just like there are models that run in 8GB, and in any other amount of VRAM; my point is that 24GB is the sweet spot.

> Today's gaming PCs come with 12GB to 20GB of vRAM.

More like 8-20GB of VRAM.


The RTX 3090 has 24GB of memory, while a quantized Llama 70B takes around 60GB. You can offload a few layers onto the GPU, but most of them will run on the CPU at terrible speeds.
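That partial offload is what the llama.cpp-style n_gpu_layers setting controls; a minimal sketch with the llama-cpp-python bindings (the GGUF path and layer count are placeholders, and the right count depends on the quant you pick):

    # Minimal sketch: put as many layers as fit in 24GB on the GPU, run the rest on CPU.
    # Model path and layer count are placeholder assumptions.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/llama-70b.Q4_K_M.gguf",  # placeholder path
        n_gpu_layers=30,  # however many layers fit in 24GB VRAM; the rest stay on the CPU
        n_ctx=4096,       # context window
    )

    out = llm("Q: Why is partial offload slow?\nA:", max_tokens=64)
    print(out["choices"][0]["text"])

As the comment says, once most layers live in system RAM the CPU becomes the bottleneck, so offloading only a minority of layers doesn't buy much speed.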
