Yeah I'd be interested in hearing that as well - I think most will take a slight hit in speed for something that's been successfully used in production with several well-known companies.
Me too - let's see what the sustained performance is like. That said, with this much headroom, I'm cautiously optimistic that even with some throttling going on, it'll still be plenty fast for anything I'm likely to throw at it.
It's funny: you see comments like this, and people confidently set very specific hurdles for what these models should be able to do.
I have a rather large spend across the universe of models, and compared to a year ago, it's amazing what is possible. If progress continues anywhere close to this speed, it will be amazing what will be possible a year from now.
Right. But there's so much effort, money and reputation invested in various configurations, experimental architectures, etc. that I feel something is likely going to pan out in the coming months, enabling models with more capabilities for less compute.
Well, hardware and parameter count are scaling exponentially, so it seems very feasible that it could happen very soon. Of course it's possible that we'll hit a wall somewhere, but just scaling current models up could be enough to get to the point where they can self-improve or gain more compute for themselves.
Exactly. Correspondingly, my hope (along with one of the folks below) is that advances in hardware will obliterate any speed difference and enable the back end to be king.
Yes and it should be faster, but I don't want to promise anything until we see what real world usage and performance is like during the beta. Thanks for the feedback, again.