I saw Deepgram's claims as well and believed them too. Then I tried it, and it was TERRIBLE. Don't believe them. It only does well on the benchmark they trained it on. It is faster, but the quality is terrible.
Did you try their enhanced models? We're using them for relatively high-quality audio files, and their accuracy is better than the Whisper small.en model. More importantly, their word-level timestamps are worlds better than Whisper's.