
This is just horrible. AMD APUs can easily lose 20-30% of GPU performance with DDR4-2400 memory.



I was under the impression the 2400G is already heavily memory-bandwidth limited, so unless memory gets a whole lot faster I don't see them packing all that much more into an APU.

Sounds to me like a perverse realisation of AMD's APU ideal: treating VRAM like system RAM, although in this case they are still disparate memory spaces.

What I'm saying is that GPUs rely on having the memory close to the die so that it actually has enough bandwidth to saturate the cores. System memory is not very close to the CPU (compared to GPUs), so I have doubts about whether an APU would be able to reach GPU levels of performance over gigabytes of model weights.

Should be interesting to see how well this works - the primary issue with these APU designs is matching the amount of available memory bandwidth to the CPU and GPU cores.

Assuming they adapt the conventional Ryzen DDR4 controller into a 128-bit wide GDDR5 controller at a reasonable 7Gbit/s per pin, this would have ~112GB/s of memory bandwidth: roughly 3x that of the 2400G (which is regarded as bandwidth starved), but slightly more than half that of the Intel + Radeon RX Vega i7-8809G (which is regarded as not having enough GPU cores to use the provided bandwidth).
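A quick sanity check of that arithmetic (the 7Gbit/s pin rate is the assumption from the comment above; DDR4-2400 is used as the 2400G's comparison point):

```python
def peak_bw_gb_s(bus_width_bits: int, data_rate_gt_s: float) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gt_s

hypothetical = peak_bw_gb_s(128, 7.0)  # 128-bit GDDR5 at 7 Gbit/s/pin -> 112 GB/s
r2400g = peak_bw_gb_s(128, 2.4)        # 2400G, dual-channel DDR4-2400 -> 38.4 GB/s
print(hypothetical, r2400g)            # roughly a 3x gap
```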

Probably a reasonably balanced system. Would be interesting to see if this makes it over to their other embedded lines as a high end option: https://www.amd.com/en/products/embedded-ryzen-v1000-series


This is similar to the Apple GPU trick where unified memory means it's easier to load large models into GPU space, but as others have said, the AMD APU built into these chips is _really_ underpowered, even compared to other RDNA APUs like the one in the Steam Deck.

Not exactly. While that may have been the case in the past, when AMD APUs were designed as a homogeneous CPU-GPU unit from the start, I can tell you that's not the case with the relatively modern 5000 series.

I need to go into the BIOS and explicitly specify how much of the system RAM I want allocated exclusively to the integrated GPU; the rest stays available as system RAM.

The reason is that the Ryzen 5000 APUs seem to have been rushed out the door: they're just a Zen 3 CPU and a separate Vega GPU glued together on the same die, not a homogeneous design meant to work as one unit like the PS5's APU. Memory-wise they're unaware of each other. Even though AMD calls them APUs, they're really more like a separate CPU and GPU that happen to share a die.

I wish I had known about this limitation at the time, as Intel chips with integrated graphics have unified memory.


A lot of that is due to the low memory bandwidth on mainstream CPU sockets: 128-bit bus, going up to around 6GT/s currently. That's half the memory bandwidth of NVIDIA's current entry-level laptop discrete GPU. And more L3 cache on a CPU directly addresses that memory bandwidth limitation (provided it's accessible to the iGPU, which would certainly be the case if AMD put 3D cache on an APU).
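To put rough numbers on that (the 6GT/s socket figure comes from the comment above; the laptop-GPU comparison point is an assumption, loosely modeled on an entry-level 96-bit GDDR6 part):

```python
# Peak bandwidth in GB/s = bus width (bytes) * per-pin data rate (GT/s)
cpu_socket = 128 / 8 * 6.0    # mainstream CPU socket: 128-bit at 6 GT/s -> 96 GB/s
laptop_dgpu = 96 / 8 * 16.0   # e.g. a 96-bit GDDR6 part at 16 GT/s -> 192 GB/s
print(cpu_socket, laptop_dgpu)  # the socket delivers exactly half
```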

"Right now, consumer GPUs top out at 32GB of VRAM. The M1 Max has, in a sense, 64GB (minus OS baseline overhead) of VRAM for its GPU to use."

We've had AMD APUs for years; you can shove 256GB of RAM in there. But no one cares, because a huge chunk of memory attached to a slow GPU is useless.


GPUs have a limited amount of memory.

You'd think after AMD shipped the Xbox One, Xbox Series X, PS4, and PS5 with decent iGPUs and decent memory bandwidth, they might bless one of their APUs with >128-bit wide memory. It seems an obvious bottleneck whose removal would improve GPU performance.

It's coming with Strix Halo and its 256-bit wide memory, but should it really have taken over a decade?


Yes, these APUs are limited to fairly slow DDR3 RAM. It has been suggested that making the GPU part of the chip bigger wouldn't help because it would be bandwidth limited. There are a couple of possible solutions to this: The PS4 uses GDDR so it's more like a traditional GPU with a CPU added on. Intel's Iris Pro uses a large fast L4 cache. The Xbox One has some fast graphics RAM on chip. AMD will need to do something to increase memory bandwidth if they want to sell processors for more than $200.

Not only that, it's shared memory with the on-chip GPU. Isn't this a big downgrade in both memory capacity and GPU capability?

>"The new A100 with HBM2e technology doubles the A100 40GB GPU's high-bandwidth memory to 80GB and delivers over 2 terabytes per second of memory bandwidth."


To some extent, that's not really that impressive. AMD gets access to a breakthrough technology like HBM and 2.5D silicon interposers, and all we get is just a measly 50% improvement in memory bandwidth?

A more interesting configuration would be attaching 12 1GB HBM stacks to the GPU, achieving a memory bandwidth of 128GB/s * 12 = 1.5TB/s (which would increase power by only ~30W over the current model).

Maybe their GPU is too weak to support such massive memory bandwidth, and it would be quite hard to do so?


related:

(74 days ago)

"AMD Ryzen APU turned into a 16GB VRAM GPU and it can run Stable Diffusion"

https://news.ycombinator.com/item?id=37162762


There's a 24 GB limitation for consumer GPUs, because AMD and Intel aren't competitive.

> You can't open a 40GB scene entirely in GPU RAM on any single-GPU system, laptop or otherwise, because there aren't any 40GB+ GPUs.

The second revision of the A100 has 80 GiB of memory: https://www.nvidia.com/en-us/data-center/a100/


GPUs don't have more memory bandwidth, which is why they're using a memory-hard problem.

It doesn't explain why GPUs that are integrated into a CPU (like AMD's) and use the same memory cannot use all of it.

If they could use all the memory, you could cheaply run neural networks with 64GB of RAM without buying a professional GPU. No wonder manufacturers don't want you to be able to do that.

