Thanks! Of course they are a bit GPU-centric, but the idea is there.
Very interesting stuff. I wonder both whether the Zynq UltraScale RFSoC PCIe card would work in that chassis and whether I could get register-level access from macOS.
Interesting... what bus does it use if not PCIe? At the driver level, I'm guessing it just dumps NVMe packets into shared memory and twiddles some sort of M1-specific hardware register?
The downside of PCIe is that it's very complex, and the tools make interfacing with it bewildering. I really want a PCIe FPGA that looks to me like data magically appears on an AXI bus.
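For what it's worth, the Xilinx XDMA bridge IP plus their open-source dma_ip_drivers gets reasonably close to that: your logic just sees an AXI4 / AXI4-Stream port, and DMA transfers show up on the host as plain character devices. A minimal host-side sketch, assuming the driver's default device names and a single card-to-host stream channel (details may differ on your setup):

```c
/* Rough sketch: read a DMA buffer from an FPGA running the Xilinx XDMA
 * bridge IP, via the character devices created by the dma_ip_drivers.
 * The device node name below is the driver default and is an assumption
 * about your configuration. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* Card-to-host channel 0, as exposed by the XDMA driver. */
    int fd = open("/dev/xdma0_c2h_0", O_RDONLY);
    if (fd < 0) {
        perror("open /dev/xdma0_c2h_0");
        return EXIT_FAILURE;
    }

    unsigned char buf[4096];
    /* Each read() kicks off a DMA transfer; on the FPGA side the same
     * data was simply written into an AXI4-Stream interface. */
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0)
        perror("read");
    else
        printf("received %zd bytes over PCIe DMA\n", n);

    close(fd);
    return 0;
}
```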
Thanks for the info! Looks like a really awesome little board that's going to allow all KINDS of interesting use cases. I really hope to see a new TuringPi cluster; doubly so with ESXi now being available!
One note about your article:
Due to the nature of PCIe, ANY GPU will happily run on the single lane presented here. The limitation will be data transfer, which primarily shows up in texture loading. Tom's Hardware ran an article [0] a decade ago comparing lane count to performance. This is also why some x16 connectors are listed as "electrically x8": they only have eight lanes wired up. (You can check what a card actually negotiated from Linux; see the sketch at the end of this comment.)
However, you will have issues with power delivery: PCIe GPUs are allowed to pull up to 75 watts from the slot itself before needing direct power connectors. There are ways around this; I ran a mini-PCIe (WiFi slot) to external PCIe adapter with a GPU on an old laptop for a couple of years just fine.
Another use case (and likely a more relevant one) would be plugging in an HBA for a NAS. Upgrading the NIC to something multi-gig sounds interesting, but that PCIe x1 link is not going to be any faster than the built-in GbE except in edge cases. I'm really curious to see what people do with this.
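Incidentally, if anyone wants to see what lane count and speed a card actually trained to in a slot like this, Linux exposes it in sysfs. A minimal sketch; the device address 0000:01:00.0 is just a placeholder, substitute your own from lspci:

```c
/* Quick sketch: print the negotiated PCIe link speed and width of one
 * device from sysfs. The BDF below is an example placeholder. */
#include <stdio.h>

static void show(const char *attr)
{
    char path[256], value[64];
    snprintf(path, sizeof path,
             "/sys/bus/pci/devices/0000:01:00.0/%s", attr);
    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return;
    }
    if (fgets(value, sizeof value, f))
        printf("%-20s %s", attr, value);
    fclose(f);
}

int main(void)
{
    /* current_* is what the link trained to, max_* is what the card offers. */
    show("current_link_speed");
    show("current_link_width");
    show("max_link_speed");
    show("max_link_width");
    return 0;
}
```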
It looks from their page like it will support standard PCIe, but there is a distinct lack of power headers, so I imagine their new 2x PCIe interface thingy is what they're pushing.
I think Pensando and Xilinx are the networking story, not PCIe. The clusters use Cray's Slingshot. Within a node, the x64 cores and GPU units are on a common fabric, which is also not PCIe.
Sounds like this might be the start of a series of articles on HaD; hopefully they'll cover making a custom endpoint. That is also my impression: you need a beefy FPGA and possibly even proprietary IP cores to play with PCIe, but it'd be cool to be proven wrong there.
This is interesting. I hope someone will do the same to re-add PCIe 4.0 support to AMD AM4 B450 chipset mainboards (it was available temporarily until AMD took it away again in later AGESA versions). Not a trivial hack, I reckon.
There are a few FPGA dev kits that are set up this way. I have one with a dual-core Atom CPU and a large FPGA connected via PCIe, and the speed is fast.
With the rise of GPU-compute, a lot of the supercomputers are playing around with faster I/O systems. IBM pushed OpenCAPI / NVLink with Nvidia, and I think that inspired the PCIe ecosystem to innovate.
PCIe standards are including more and more coherent-memory options. It seems like PCIe is trying to become more like Infinity Fabric (AMD) / UltraPath Interconnect (Intel).