Hacker Read

og_kalu | karma 4856 | avg karma 2.4 · 2023-04-22 14:50:12

Fair on the wording, I suppose, but:

First of all, the dataset used for evaluation was created by those researchers, which weights it in their favor.

Second, GPT-4 still performs better on 6 of those, hardly 1 or 2. And when it doesn't, it's usually very close.

All of this is to say that GPT-4 will smoke any bespoke NLP model/API, which is the main point.