Do I need industry-grade GPUs, or can I scrape by with decent tokens per second on a consumer-level GPU?

  • hendrik@palaver.p3x.de
    6 points · edited 1 day ago

    I’d say you’re looking for something like an 80 GB VRAM GPU. That’d be industry grade (an Nvidia A100, for example).

    And to squeeze the model into 80 GB, it would need to be quantized to 4 or 5 bits. There are LLM VRAM calculators available where you can plug in your numbers, like this one.
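    The back-of-the-envelope math those calculators do can be sketched roughly like this (a minimal illustration, assuming weights dominate memory and using a made-up 20% overhead fraction for KV cache and activations; real tools account for context length and architecture):

    ```python
    def estimate_vram_gb(params_billion: float, bits: int, overhead_frac: float = 0.2) -> float:
        """Rough VRAM estimate in GB: weight bytes (params * bits / 8)
        plus a flat fraction for KV cache and activation overhead."""
        weight_gb = params_billion * 1e9 * bits / 8 / 1e9
        return weight_gb * (1 + overhead_frac)

    # A 70B-parameter model quantized to 4 bits:
    # 70e9 * 4 / 8 = 35 GB of weights, ~42 GB with overhead,
    # which is why it fits on one 80 GB card with room to spare,
    # while 8-bit or fp16 would not.
    print(round(estimate_vram_gb(70, 4), 1))
    ```

    It’s only a ballpark; long contexts can blow the overhead well past 20%.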

    Another option would be to rent these things by the hour in some datacenter (at about $2 to $3 per hour). Or do inference on a CPU with a wide memory interface, like an Apple M3 or an AMD Epyc. But those are pricey too, and you’d need to buy them alongside an equal amount of (fast) RAM.