Training requires a lot more memory: you keep the gradients plus the optimizer's gradient statistics for every parameter, and you need higher-precision weights for the optimizer step. It's also much more parallelizable. But inference is essentially a subroutine of training.
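To put rough numbers on that, here's a minimal back-of-the-envelope sketch, assuming mixed-precision training with Adam (the standard per-parameter accounting, e.g. from the ZeRO paper); the 7B parameter count is just an illustrative assumption, not something from the parent comment:

```python
def inference_bytes_per_param(dtype_bytes: int = 2) -> int:
    """Inference only needs the weights, e.g. fp16 -> 2 bytes/param."""
    return dtype_bytes

def training_bytes_per_param() -> int:
    """Mixed-precision training with Adam keeps, per parameter:
    fp16 weight (2) + fp16 gradient (2) + fp32 master weight (4)
    + fp32 Adam first moment (4) + fp32 Adam second moment (4).
    """
    return 2 + 2 + 4 + 4 + 4  # = 16 bytes/param

n_params = 7e9  # hypothetical 7B-parameter model
print(f"inference: ~{n_params * inference_bytes_per_param() / 1e9:.0f} GB")
print(f"training:  ~{n_params * training_bytes_per_param() / 1e9:.0f} GB")
# inference: ~14 GB, training: ~112 GB -- roughly an 8x gap,
# and that's before counting activation memory for backprop.
```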
Did you custom-build those language processors for this task, or did you repurpose something that already exists? I've never heard anyone use the term 'language processor' before.