Hacker Read

gooseus | karma 2377 | avg karma 5.06 · 2023-03-23 16:49:04

OpenAI has been collecting a ton of evals at https://github.com/openai/evals, and many of them include comments on how well GPT-4 does vs GPT-3.5.

You could clone that repo, adapt the oaieval script to run against the other APIs you care about, then run the same evals against each model and compare the results.
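For the "compare the results" step, here's a rough sketch. It assumes you've already reduced each run to a mapping from sample ID to pass/fail; how you get that out of the oaieval logs depends on the eval and isn't shown. The names (accuracy, compare_runs, the sample IDs) are made up for illustration and aren't part of the evals repo.

```python
from typing import Dict, List, Union

Results = Dict[str, bool]  # hypothetical shape: sample ID -> did the model pass?


def accuracy(results: Results) -> float:
    """Fraction of samples that passed; 0.0 for an empty run."""
    return sum(results.values()) / len(results) if results else 0.0


def compare_runs(a: Results, b: Results) -> Dict[str, Union[float, List[str]]]:
    """Compare two runs on the samples they have in common."""
    shared = sorted(set(a) & set(b))
    a_shared = {s: a[s] for s in shared}
    b_shared = {s: b[s] for s in shared}
    return {
        "accuracy_a": accuracy(a_shared),
        "accuracy_b": accuracy(b_shared),
        # Samples where exactly one of the two models passed.
        "only_a_passed": [s for s in shared if a[s] and not b[s]],
        "only_b_passed": [s for s in shared if b[s] and not a[s]],
    }


if __name__ == "__main__":
    # Made-up results, just to show the shape of the input.
    model_a = {"q1": True, "q2": False, "q3": True, "q4": True}
    model_b = {"q1": True, "q2": True, "q3": False, "q4": False}
    print(compare_runs(model_a, model_b))
```

The disagreement lists are usually more informative than the two accuracy numbers alone, since they tell you which samples to go read.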
