I assume the stuff from 5 years ago was essentially spitting out MIDI, which would then be fed into a traditional tool to play samples. So it's going to sound a lot sharper while being a lot less sophisticated. The real breakthrough here is that this generates everything from scratch and still resembles the prompt.
One of the automated prompts was "Eminem anger rap". I'm confident that if you had played me the audio without the prompt, I could have identified which artist it sounded like.
And this is just a basic first attempt at reusing a tool not even designed for audio. I can only imagine how powerful it could be after some trivial revisions like using GPT-3 to generate coherent lyrics.