Impressive speed. Are there any plans to run fine-tuned models?


Yeah, I'd be interested in hearing that as well. I think most people would accept a slight hit in speed for something that's been used successfully in production at several well-known companies.

It's going to get faster. That's why it's still experimental. ;)

If the speedups pan out, that is the plan! The work on performance hasn't started yet, however.

There are a lot of use cases just waiting for a good system that can manage at least real-time generation.

Extremely impressive. Does anyone know if performance is likely to decrease as more features are implemented? Because if not, this is a winner.

It will be interesting to see if they can start to load models like GPT-3 onto ASICs and realize some serious performance gains.

We'll see how fast it is on consumer hardware once decent quantisations are available.
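
For anyone who wants to try it once quantised checkpoints do land: here's a minimal sketch of 4-bit loading with Hugging Face transformers + bitsandbytes. The model id is just a placeholder, not the actual model being discussed:

    # Minimal sketch: load a causal LM in 4-bit via transformers + bitsandbytes.
    # The model id below is a placeholder -- swap in whatever checkpoint you test.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "some-org/some-65b-model"  # placeholder, not a real checkpoint

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store weights in 4-bit
        bnb_4bit_compute_dtype=torch.float16,  # do the math in fp16
        bnb_4bit_quant_type="nf4",             # normal-float 4-bit quantization
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # spread layers across available GPUs/CPU
    )

    inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))

4-bit weights are roughly a quarter of the fp16 memory footprint, which is what makes models this size plausible on consumer GPUs at all.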

Woah cool. That plus fast/optimized hardware will probably be it.

Me too - let's see what the sustained performance is like. That said, with this much headroom, I'm cautiously optimistic that even with some throttling going on, it'll still be plenty fast for anything I'm likely to throw at it.

It's funny. You see comments like this, and I think people confidently set very specific hurdles for what these models should be able to do.

I have a rather large spend across the universe of models, and compared to a year ago, it's amazing what is possible. If we continue at anywhere close to this speed, it will be amazing what's possible a year from now.


Indubitably, good fellow.

I suspect if we can fine tune and optimize this 65B model, we can achieve some truly remarkable results.
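
If anyone wants a concrete picture of what fine-tuning at that scale might look like without updating all 65B weights, here's a minimal LoRA sketch using the peft library. The checkpoint id and target module names are assumptions (LLaMA-style naming), not confirmed details of this model:

    # Minimal sketch: parameter-efficient fine-tuning with LoRA via peft.
    # Checkpoint id and target_modules are assumptions, not confirmed.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained(
        "some-org/some-65b-model",  # placeholder checkpoint
        device_map="auto",
    )

    lora_config = LoraConfig(
        r=8,                                  # low-rank adapter dimension
        lora_alpha=16,                        # adapter scaling factor
        target_modules=["q_proj", "v_proj"],  # attention projections (LLaMA-style names)
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of total params

The appeal is that only the small adapter matrices are trained, so the memory and compute cost is a fraction of full fine-tuning.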


This is a welcome spec improvement! I'll be curious how long it takes to reach wide adoption.

Right. But there's so much effort, money, and reputation invested in various configurations, experimental architectures, etc. that I feel something is likely to pan out in the coming months, enabling models with more capability for less compute.

I like where they are going. Are there benchmarks out yet?

Well, hardware and parameter counts are scaling exponentially, so it seems very feasible that it could happen quite soon. Of course, it's possible that we'll hit a wall somewhere, but it seems that just scaling current models up could be enough to reach the point where they can self-improve or gain more compute for themselves.

I expect the performance here to be blazingly fast... is that an accurate assumption?

Exactly. Correspondingly, my hope (along with one of the folks below) is that advances in hardware will obliterate any speed difference and let the back end be king.

Time will tell.


Yes, and it should be faster, but I don't want to promise anything until we see what real-world usage and performance look like during the beta. Thanks again for the feedback.

Ooooh, this is interesting. I'll see what the performance looks like.