I was under the impression the 2400G is already heavily memory-bandwidth limited, so unless memory gets a whole lot faster I don't see them packing all that much more into an APU.
Sounds to me like a perverse realisation of AMD's APU ideal: treating VRAM like system RAM, although in this case they are still disparate memory spaces.
What I'm saying is that GPUs rely on having the memory close to the die so that it actually has enough bandwidth to saturate the cores. System memory is not very close to the CPU (compared to GPUs), so I have doubts about whether an APU would be able to reach GPU levels of performance over gigabytes of model weights.
Should be interesting to see how well this works - the primary issue with these APU designs is matching the amount of available memory bandwidth to the CPU and GPU cores.
Assuming they adapt the conventional Ryzen DDR4 controller into a 128-bit wide GDDR5 controller at a reasonable 7Gbit/s per pin, this would have ~112GB/s of memory bandwidth. That's roughly 3x that of the 2400G (which is regarded as bandwidth starved), but only slightly more than half what the Intel + Radeon RX Vega i7-8809G has (which is regarded as not having enough GPU cores to use the provided bandwidth).
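As a sanity check on those numbers, peak bandwidth is just bus width times per-pin data rate. A quick sketch (the DDR4-2400 rate for the 2400G and the 1024-bit HBM2 figures for the i7-8809G's Vega M are my assumptions, not from the comment):

```python
def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak theoretical bandwidth in GB/s: (bus width in bytes) * (Gbit/s per pin)."""
    return bus_width_bits / 8 * data_rate_gbps

# Hypothetical 128-bit GDDR5 controller at 7 Gbit/s per pin (the scenario above)
gddr5 = bandwidth_gbs(128, 7.0)        # 112.0 GB/s
# 2400G: dual-channel (128-bit) DDR4-2400, i.e. 2.4 Gbit/s per pin (assumed)
ryzen_2400g = bandwidth_gbs(128, 2.4)  # 38.4 GB/s
# i7-8809G's Vega M GH: 1024-bit HBM2 at 1.6 Gbit/s per pin (assumed)
vega_m = bandwidth_gbs(1024, 1.6)      # 204.8 GB/s

print(gddr5 / ryzen_2400g)  # ~2.9x the 2400G
print(gddr5 / vega_m)       # ~0.55x the i7-8809G
```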
This is similar to the Apple GPU trick, where unified memory means it's easier to load large models into GPU space. But as others have said, the AMD iGPU built into these chips is _really_ underpowered, even compared to other RDNA APUs like the one in the Steam Deck.
Not exactly. While that may have been the case back when AMD APUs were designed as a homogeneous CPU-GPU unit from the start, I can tell you that's not the case with the relatively modern 5000 series.
I need to go into the BIOS and specify explicitly how much of the system RAM I want allocated exclusively to the integrated GPU; the rest stays available as system RAM.
The reason is that the Ryzen 5000 APUs seem to have been a job rushed out the door: they're just a Zen 3 CPU and a separate Vega GPU glued together on the same die, not a homogeneous design meant to work as one unit like the PS5's APU. Memory-wise they're unaware of each other. Even though AMD calls them APUs, they're really more like a separate CPU and GPU on the same die.
I wish I had known about this limitation at the time, as Intel chips with integrated graphics have unified memory.
A lot of that is due to the low memory bandwidth on mainstream CPU sockets: 128-bit bus, going up to around 6GT/s currently. That's half the memory bandwidth of NVIDIA's current entry-level laptop discrete GPU. And more L3 cache on a CPU directly addresses that memory bandwidth limitation (provided it's accessible to the iGPU, which would certainly be the case if AMD put 3D cache on an APU).
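The "half the bandwidth of an entry-level laptop dGPU" claim checks out with the same formula. A rough sketch, assuming the entry-level part is something like an RTX 4050 Laptop with a 96-bit GDDR6 bus at 16 Gbit/s per pin (my assumption, not stated above):

```python
def peak_bandwidth_gbs(bus_bits, gt_per_s):
    # bytes/s = (bus width in bytes) * (transfers per second per pin)
    return bus_bits / 8 * gt_per_s

# Mainstream CPU socket: 128-bit bus at ~6 GT/s (e.g. dual-channel DDR5-6000)
cpu_socket = peak_bandwidth_gbs(128, 6.0)   # 96.0 GB/s
# Assumed entry-level laptop dGPU: 96-bit GDDR6 at 16 Gbit/s per pin
laptop_dgpu = peak_bandwidth_gbs(96, 16.0)  # 192.0 GB/s

print(cpu_socket / laptop_dgpu)  # 0.5
```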
You'd think that after AMD shipped the Xbox One, Xbox Series X, PS4, and PS5 with decent iGPUs and decent memory bandwidth, they might bless one of their APUs with a >128-bit wide memory bus. It seems an obvious bottleneck whose removal would improve GPU performance.
It's coming with Strix Halo and its 256-bit wide memory bus, but should it really have taken over a decade?
Yes, these APUs are limited to fairly slow DDR3 RAM. It has been suggested that making the GPU part of the chip bigger wouldn't help, because it would be bandwidth limited. There are a few possible solutions to this: the PS4 uses GDDR, so it's more like a traditional GPU with a CPU added on; Intel's Iris Pro uses a large, fast L4 cache; the Xbox One has some fast graphics RAM on-chip. AMD will need to do something to increase memory bandwidth if they want to sell processors for more than $200.
To some extent, that's not really that impressive. AMD gets access to breakthrough technology like HBM and 2.5D silicon interposers, and all we get is a measly 50% improvement in memory bandwidth?
A more interesting configuration would be attaching twelve 1GB HBM stacks to the GPU, achieving a memory bandwidth of 128GB/s * 12 ≈ 1.5TB/s (which would increase power by only ~30W over the current model).
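The per-stack figure follows from first-generation HBM's interface: a 1024-bit bus at 1 Gbit/s per pin. A quick sketch of the hypothetical twelve-stack aggregate (the configuration is the comment's thought experiment, not a shipping part):

```python
# First-gen HBM: each stack exposes a 1024-bit interface at 1 Gbit/s per pin
per_stack_gbs = 1024 / 8 * 1.0      # 128.0 GB/s per stack

stacks = 12                          # hypothetical twelve-stack configuration
total_tb_s = per_stack_gbs * stacks / 1000

print(total_tb_s)  # 1.536, i.e. the ~1.5 TB/s quoted above
```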
Or maybe their GPU is too weak to make use of such massive memory bandwidth, and it would be quite hard for it to do so?
It doesn't explain why GPUs that are integrated into a CPU (like AMD's) and share the same memory cannot use all of it.
If they could use all of the memory, you could cheaply run neural networks in 64GB of RAM without buying a professional GPU. No wonder manufacturers don't want you to be able to do that.