Nvidia's entire DGX and Maxwell product line was subsidized by Aurora's precursor, and Nvidia worked very closely with Argonne to solve a number of problems in GPU concurrency.
A lot of the foundational models used today were trained on Aurora and its predecessors, as well as tangential research such as containerization (e.g., in the early 2010s a joint research project between ANL's Computing team, one of the world's largest pharma companies, and Nvidia became one of Docker's largest customers and sponsored a lot of its development).
National labs sign "cost-effective" deals. NVIDIA isn't cost-effective. Aurora (at Argonne) is all Intel GPU. Aurora is also a clusterfuck so that just tells you these decisions aren't made by the most competent people.
The money didn't come to Nvidia immediately. They were in exactly the same spot as ATI when they introduced hardware shaders in the GF3 and later pioneered GPGPU on them. Moreover, ATI sometimes progressed in huge leaps (such as the Radeon 9800 Pro, which was miles ahead of anything from Nvidia). ATI and then AMD just ignored general-purpose massively parallel computation for a while, and then didn't know what to do with it, while Nvidia had a vision and actually implemented it.
They did give out a bunch of free GPUs to universities, but more than that, they have invested heavily and deliberately in HPC: community engagement, better SDKs (it's been almost a decade since I first built a Linux executable against CUDA, and that code should still work), and server SKUs sold in the server channel (AMD didn't bother to design SKUs for servers until it was too late). Other things that solidified NVIDIA's lead were the AWS win (in 2010 they managed to partner with AWS to bring GPU instances to market - and those are thriving) and the 2011 ORNL Titan win (important for the HPC community mindshare).
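To give a flavor of that stability: a minimal program along the lines of the sketch below (a hypothetical example, not any particular SDK sample) has compiled with a plain "nvcc -o saxpy saxpy.cu" and run essentially unchanged across many CUDA releases:

    // saxpy.cu -- hypothetical minimal example; build with: nvcc -o saxpy saxpy.cu
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per element
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);
        float *hx = (float *)malloc(bytes), *hy = (float *)malloc(bytes);
        for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

        float *dx, *dy;                                  // device buffers
        cudaMalloc((void **)&dx, bytes);
        cudaMalloc((void **)&dy, bytes);
        cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);
        cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);

        printf("y[0] = %f\n", hy[0]);                    // expect 4.000000
        cudaFree(dx); cudaFree(dy); free(hx); free(hy);
        return 0;
    }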
They built CUDA and embedded it as a foundational technology through university partnerships and by generally paying attention to developers. They wrote documentation, talked about their tech a lot, and generally made onboarding work.
They then had the good fortune that their competitors tried to leverage open source and cross-company collaboration. While open source does sometimes win that way, the industry built all their stuff on CUDA instead of OpenCL or OpenMP.
Nvidia thus built their gold mine.
I don't think it's impenetrable. There are some design mistakes baked into their system that are going to be difficult to unravel without breaking backwards compatibility.
NVidia has been great for this. Back in 2009 they gave me 2 Teslas (C2060) and 2 Quadro FX 5800 cards. Then, I was able to get some results from those to get an NVidia Professor Partnership grant of $25000 (as a post doc). They had a large part in launching my engineering career doing HPC.
NVIDIA does a great job of dragging their heels on OCL support while making a heavy marketing push on CUDA. IMO they produce a much higher-quality product than AMD.
If I were NVIDIA I'd probably donate scores of servers+GPUs to schools like Caltech in order to inspire curriculum just like this.
NVIDIA was amazing. They drove a bulldozer through the walled garden of HPC Welfare Queens (ask any grad student from the 00s trying to get supercomputer access what GPUs meant to them) and set the stage for the current AI boom.
IMO they deserve an entire fulfillment center of ice cream for that alone.
> Nvidia started aggressively seeding CUDA and GPUs for research in the early 2010s
I was at a niche graphics app startup circa 2000-2005 and even then NVidia invested enough to be helpful with info and new hardware, certainly better than other GPU companies. Post-2010 I was at an F500, industry-leading tech company, and an NVidia biz dev person came to meet with us every quarter, usually bearing info and sometimes access to new hardware.
It's also worth noting that NVidia has consistently invested more than their peers in their graphics drivers. While the results aren't always perfect, NVidia usually has the best drivers in their class.
NVidia got lucky, at least in the long term. They spent years perfecting SIMD to make spaceships blow up and, by coincidence, that's the same technology that enabled coin mining then deep learning.
NVidia was founded by ex-SUN ex-AMD folks. 3dfx was founded by ex-SGI folks. SUN used to compete with SGI in the high-end graphics workstation space so in a sense NVidia consolidated them both. AMD on the other hand consolidated independent pioneers like ATI, Tseng Labs etc.
> A great example of long term strategic thinking, execution and commitment, that's quite rare in tech today.
People were already using GPUs to solve more general problems than real-time graphics. In particular, folks like Mike Houston or Aaron Lefohn (both now at NVidia) worked to formalize the sorts of computations that could efficiently be mapped onto the limited hardware that existed at the time.
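To make "the sorts of computations" concrete: a canonical example is the parallel reduction, which in that era had to be expressed through the graphics pipeline (fragment shaders ping-ponging between render targets). The sketch below is a hypothetical modern CUDA rendering of the same pattern, not anyone's original code:

    // block_sum.cu -- hypothetical sketch of a block-level tree reduction,
    // the kind of pattern pre-CUDA GPGPU work had to emulate with
    // render-to-texture passes.
    __global__ void block_sum(const float *in, float *out, int n) {
        extern __shared__ float s[];                 // one float per thread
        int tid = threadIdx.x;
        int i   = blockIdx.x * blockDim.x + threadIdx.x;
        s[tid] = (i < n) ? in[i] : 0.0f;
        __syncthreads();
        // Halve the number of active threads each step.
        for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
            if (tid < stride) s[tid] += s[tid + stride];
            __syncthreads();
        }
        if (tid == 0) out[blockIdx.x] = s[0];        // one partial sum per block
    }
    // Example launch: block_sum<<<blocks, 256, 256 * sizeof(float)>>>(d_in, d_out, n);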
NVidia's management did see that opportunity to expand to the HPC market and spent a good chunk of money for years to become a real contender in that space.
Something I loved about working there was laser focus on one product and finding ways to deliver value to their customers with that one product. It was a very different paradigm from working on SoC (system on a chip) devices like cell phones, where many teams competed to gain silicon area in the chip.
I interned at NVIDIA in 2009 on the kernel mode driver team. Was super fun there in terms of the project work and the people. If the code still exists, I created the main class that schedules work out to the GPU on Windows.
That level of programming gave such rewarding moments in between difficult debugging sessions. When I wanted to test a new kernel driver build, I needed to walk into a massive room full of interconnected machines that emulated the not-yet-fabricated GPU hardware. For the entire time I was there, one of the full-time people on my team was going insane trying to track down a memory corruption issue between GPU memory and main memory when things paged out.
Back then the stock was around $7/share and the CEO announced a 10% paycut across the board (even including my intern salary) and had an all hands with everyone in the cafeteria. It's pretty cool they went from that vulnerable state, with Intel threatening to build in GPU capabilities, to the powerhouse they are today.
In the early 2000s it was a common theme in the CS literature that the von Neumann architecture was about to run out of steam and some kind of parallel processor was going to become mainstream; out of all that work, the GPGPU became the most famous. If NVIDIA hadn't done it, someone else would have.
I love this; I was among the early engineers on CUDA (compilers).
NVIDIA was so well run, but it was boxed into a smaller graphics card market - it and ATI were forced into low margins since the OpenGL and DirectX standards made them replaceable. For the standards fans - those standards resulted in a wealth transfer from NVIDIA to Apple etc. and reduced the capital available for R&D.
NVIDIA was constantly attacked by a much bigger Intel (which changed interfaces to kill products and was made to pay by a court)
Through innovation and developing new technologies (CUDA), they increased their market cap and have used that to buy Arm/Mellanox.
I love the story of the underdog run by a founder, innovating its way into new markets against harsh competition. Win for capitalism!