
Pretty much any dGPU, even a small 2GB Intel/AMD one.

If you have a 6GB GPU, it can hold all the weights. My lowly laptop 2060 can spit out a 200-token response with full context almost immediately.

If you don't have a dGPU, short prompts are OK with fast RAM, but long prompts will be slow.
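For a rough sanity check on the "6GB holds all the weights" point, here is a back-of-the-envelope sketch. The model isn't named here, so the 7B parameter count, the 4.5 bits per weight, and the 1 GiB allowance for context are all made-up illustration values, not measurements.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def fits_in_vram(n_params: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 1.0) -> bool:
    """True if the weights plus a rough allowance for context fit in VRAM.

    overhead_gib is a guessed allowance (KV cache, buffers), not a measured value.
    """
    return weights_gib(n_params, bits_per_weight) + overhead_gib <= vram_gib


# Hypothetical example: a 7B-parameter model on a 6 GiB card.
print(round(weights_gib(7e9, 4.5), 2))   # quantized to ~4.5 bits per weight
print(fits_in_vram(7e9, 4.5, 6))         # quantized
print(fits_in_vram(7e9, 16, 6))          # unquantized 16-bit
```

Under those assumptions the quantized weights fit with room to spare and the 16-bit ones don't, which is roughly why a small card is enough once the model is quantized.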


