
My general conclusion: Gemini Ultra > GPT-4 > Gemini Pro

See Table 2 and Table 7 https://storage.googleapis.com/deepmind-media/gemini/gemini_... (I think they're comparing against original GPT-4 rather than GPT-4-Turbo, but it's not entirely clear)

What they've released today: Gemini Pro is in Bard today. Gemini Pro will be coming to the API soon (Dec 13?). Gemini Ultra will be available via Bard and the API "early next year".

Therefore, as of Dec 6 2023:

SOTA API = GPT-4, still.

SOTA Chat assistant = ChatGPT Plus, still, for everything except video, where Bard has capabilities. ChatGPT Plus is closely followed by Claude. (That said, I tried asking Bard a question about a YouTube video today, and it told me "I'm sorry, but I'm unable to access this YouTube content. This is possible for a number of reasons, but the most common are: the content isn't a valid YouTube link, potentially unsafe content, or the content does not have a captions file that I can read.")

SOTA API after Gemini Ultra is out in ~Q1 2024 = Gemini Ultra, if OpenAI/Anthropic haven't released a new model by then

SOTA Chat assistant after Bard Advanced is out in ~Q1 2024 = Bard Advanced, probably, assuming that OpenAI/Anthropic haven't released new models by then




SOTA does not require being productionized. E.g., GPT-3 was SOTA while it was not publicly accessible.

There has to be some way to verify the claim. "Trust me bro" isn't science.

"Trust that I ran these tests with these results" is extremely common in science.

It's not an objective test like you're describing. These benchmarks are far from accurate, and the benchmark data itself can be tainted by leaking into the training set.

You'll find the same thing in many academic/scientific papers.

The trust is established by others reproducing the results with the same methodology; it's not just a matter of taking people's word at face value.
