
My general conclusion: Gemini Ultra > GPT-4 > Gemini Pro

See Table 2 and Table 7 https://storage.googleapis.com/deepmind-media/gemini/gemini_... (I think they're comparing against original GPT-4 rather than GPT-4-Turbo, but it's not entirely clear)

What they've released today: Gemini Pro is in Bard today. Gemini Pro will be coming to the API soon (Dec 13?). Gemini Ultra will be available via Bard and the API "early next year".

Therefore, as of Dec 6 2023:

SOTA API = GPT-4, still.

SOTA Chat assistant = ChatGPT Plus, still, for everything except video, where Bard has capabilities. ChatGPT Plus is closely followed by Claude. (That said, I tried asking Bard a question about a YouTube video today, and it told me "I'm sorry, but I'm unable to access this YouTube content. This is possible for a number of reasons, but the most common are: the content isn't a valid YouTube link, potentially unsafe content, or the content does not have a captions file that I can read.")

SOTA API after Gemini Ultra is out in ~Q1 2024 = Gemini Ultra, if OpenAI/Anthropic haven't released a new model by then

SOTA Chat assistant after Bard Advanced is out in ~Q1 2024 = Bard Advanced, probably, assuming that OpenAI/Anthropic haven't released new models by then




SOTA does not require being productionized. E.g., GPT-3 was SOTA while it was not publicly accessible.

There has to be some way to verify the claim. "Trust me bro" isn't science.

"Trust that I ran these tests with these results" is extremely common in science.

It's not an objective test like you're describing. These benchmarks are far from accurate, and the benchmark data itself can be tainted by leaking into the training set.

You'll find the same thing in many academic/scientific papers.

The trust is established by others reproducing the results with the same methodology; it's not just a matter of taking people's word at face value.
