From a high-level design standpoint, wouldn’t the general-purpose nature of NVIDIA’s GPUs (even with their AI/LLM optimizations) put them at a disadvantage compared to more custom/dedicated inference designs? (Disregarding real-world issues like startup execution risk, assume the competitors succeed at their engineering goals.) Or is there some fundamental architectural reason why NVIDIA can/will always be highly competitive in AI inference? Is the general-purposeness of the GPU not as much of an overhead/disadvantage as it seems?
Also how critical is NVIDIA’s infiniband networking advantage when it comes to inference workloads?
Custom chips have to be much better than NVIDIA to become attractive. Being 2x faster won’t be enough; 5x faster might be. And that’s assuming perfectly functioning software.
Is software really that important on the inference side, assuming all the key ops are supported by the compiler? Once the model is quantized and frozen, deployment to alternative chips, while somewhat cumbersome, hasn’t been too challenging, at least in my experience deploying to Qualcomm NPUs (with models trained on NVIDIA).
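For what it’s worth, the deployment flow I mean looks roughly like the sketch below (toy model, made-up file names, and a purely hypothetical vendor-converter invocation, not an actual toolchain recipe): freeze the trained model to ONNX, quantize it offline, then hand it to the NPU vendor’s compiler, which is exactly the step where “are all the key ops supported?” gets answered.

    # Sketch only: the model, file names, and vendor step are placeholders.
    import torch
    import torch.nn as nn
    from onnxruntime.quantization import quantize_dynamic, QuantType

    # Stand-in for a model already trained on NVIDIA GPUs.
    model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
    model.eval()

    # Freeze the graph into a portable format most NPU toolchains can ingest.
    torch.onnx.export(model, torch.randn(1, 128), "model_fp32.onnx", opset_version=17)

    # One-time offline post-training quantization to int8 weights.
    quantize_dynamic("model_fp32.onnx", "model_int8.onnx", weight_type=QuantType.QInt8)

    # From here the vendor toolchain takes over (hypothetical command):
    #   <vendor>-converter --input model_int8.onnx --output model.bin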
Let me put it this way: if there’s even the slightest issue with my PyTorch code (training or inference) running on a non-NVIDIA chip, it’s an automatic no from me. More than that: if I merely suspect there will be issues, I won’t even try it, regardless of any promised speedups.
Whoever wants to sell me their chip had better do an amazing demo of flawless software integration.
If the savings on hardware/compute are greater than the cost of the adjustments, then it is probably worth it.
So if you prefer to avoid spending, say, a month on adjustments and testing just to keep using hardware that is, say, 1.x times more expensive, it’s your loss in the long run.
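Purely to illustrate the break-even arithmetic (every number below is an assumption, not a figure from this thread):

    # Back-of-envelope break-even: how long until the porting effort pays off?
    eng_cost = 50_000.0         # one-off cost of a month of porting/testing
    nvidia_monthly = 100_000.0  # current monthly inference hardware/compute bill
    price_ratio = 1.5           # assumed: incumbent bill is 1.5x the alternative's

    alt_monthly = nvidia_monthly / price_ratio
    monthly_savings = nvidia_monthly - alt_monthly
    breakeven_months = eng_cost / monthly_savings
    print(f"Port pays for itself after {breakeven_months:.1f} months")
    # With these made-up numbers: ~1.5 months, after which the cheaper
    # hardware is pure savings, provided the software really does just work.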