
Google's marketing materials said it's slightly better than GPT-4 across benchmarks. I'll be checking leaderboards on Hugging Face over the next few days for independent confirmation.



So it's basically just GPT-4, according to the benchmarks, with a slight edge for multimodal tasks (i.e. audio, video). Google does seem to be quite far behind; GPT-4 launched almost a year ago.

Google's own benchmarking shows that Gemini Pro is just slightly better than GPT-3.5 and Gemini Ultra is comparable to GPT-4 (see their technical paper).

I still don't trust benchmarks, but they've come a long way.

It's genuinely outperforming GPT4 in my manual tests.


Ultra was benchmarked against the original release of GPT-4, not the current model. My understanding is that was fairly accurate — it's close to current GPT-4 but not quite equal. However, close-to-GPT-4 at 4x cheaper and 10x the context length would be very impressive and IMO useful.

Is this going to beat both gpt4t and gpt4o in benchmarks?

It's not really comparable to GPT-4. It's comparable to Google Search.

I use this over Google very frequently but it's still not as good as gpt4

They’re extrapolating from the performance of GPT-3.5. It’s speculative, but not anecdotal. GPT has improved rapidly over time, so it's not a huge leap to predict that GPT-4 will be even better.

In my experience it's better than GPT3.5, not as good as GPT4.

It beats GPT 3.5 in some benchmarks, the first open model to do so I believe.

Versions being worked on now will do much better.

GPT-4 is far better and will likely not be beaten by any current open model or approach, except maybe an ensemble of them.


The whitepaper has a few benchmarks vs. GPT-4, though most are reported benchmarks. Most of the blogs/news articles I've seen mention Google's push to focus on GPT-3.5. I found the whitepaper table way better at summarizing this. https://storage.googleapis.com/deepmind-media/gemini/gemini_...

That is fair; your post left it a bit ambiguous whether you meant better in reference to GPT-4 or not.

Competitors aren't even at GPT-3.5.


Wait, really? I've only been using GPT4 and it seemed like it's been getting incrementally better. Do you have any test cases?

The article contains benchmarks for those tests. On several it is better than GPT-3.5.

In terms of performance, GPT-4o doesn't seem like an improvement over GPT-4 (even worse in some cases, afaiu).

And Google showcased the project astra thing, which seems like the equivalent.


State of the art is still GPT-4? Others are playing catch-up or hitting very similar benchmarks.

According to their benchmarks, it is superior to GPT-3.5.

The performance results here are interesting. G-Ultra seems to meet or exceed GPT-4V on all text benchmark tasks with the exception of HellaSwag, where there's a significant lag: 87.8% vs 95.3%, respectively.

No race has begun. GPT-4 is so far ahead in everything, even in their official metrics[1], and those report the official metrics for the first version of GPT-4 from its paper. People have run the benchmarks again and found much better results, like 85% on HumanEval. It's like no one even thinks about comparing to GPT-4; it is just reported as the gold standard.

[1]: https://x.ai/

