Being able to blend smoothly between prompts and attention weightings from a fixed seed is definitely a fantastic and underexplored avenue; it reminds me of the "vector synthesis" found in wavetable synthesizers since the '80s, as discussed here[0]. I feel we're only a couple of months away from people using MIDI controllers to explore these kinds of spaces. Something could be hacked together today, but it will be really interesting once the images can be generated in near-realtime as the controls are adjusted.
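Something along these lines could already be hacked together from off-the-shelf pieces. Here's a rough sketch, assuming a recent diffusers and mido, with the two prompts, the CC number (74), and the seed all as placeholders:

    # Map one MIDI CC knob to a linear blend between two prompt embeddings,
    # re-rendering from a fixed seed on every knob move.
    # Assumes diffusers + mido and a connected MIDI controller.
    import torch, mido
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5").to("cuda")

    def embed(text):
        ids = pipe.tokenizer(text, padding="max_length", truncation=True,
                             max_length=pipe.tokenizer.model_max_length,
                             return_tensors="pt").input_ids.to("cuda")
        return pipe.text_encoder(ids)[0]

    a, b = embed("a foggy forest"), embed("a neon city")      # placeholder prompts

    with mido.open_input() as port:                           # default MIDI input
        for msg in port:
            if msg.type == "control_change" and msg.control == 74:
                t = msg.value / 127.0                         # knob position 0..1
                blend = (1 - t) * a + t * b                   # naive linear interpolation
                img = pipe(prompt_embeds=blend, num_inference_steps=20,
                           generator=torch.Generator("cuda").manual_seed(42)).images[0]
                img.save("frame.png")

Every knob twist re-renders the same seed at a new blend position, so you're scrubbing through one space rather than generating unrelated images; the bottleneck today is just the per-frame render time.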
As with Stable Diffusion, text prompting will be the least controllable way to get useful output from this model. I can easily imagine MIDI being used as an input to ControlNet to essentially get a neural synthesizer.
This is something that has crossed my mind recently... Did you reach any conclusions/results? How did you layer the different MIDI tracks? It would be pretty cool to build a VSTi or AU with some AI built in...
Google has actually released Magenta Studio [0], a set of plugins for Ableton Live that can do just that. It's pretty cool: it generates interesting melodic content at the click of a button.
This is great! I found the more robust examples really useful, like the ones that show notes and triads in a key. I'd love to have something like this in the form of a VST or something usable in Ableton.
I've been wondering if machine learning or metaheuristics could be used to get close to an input sound. I've been toying with this idea using MIDI CC with a Rytm.
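A crude hill-climb version of that idea fits in a screenful, assuming mido and sounddevice, with the CC numbers, the trigger note, and the target file as placeholders for whatever the Rytm actually exposes:

    # Toy hill-climb over a few MIDI CC parameters: nudge a value, send it to
    # the synth, record the result, and keep the change only if the spectrum
    # moves closer to the target clip.
    import numpy as np, mido, sounddevice as sd

    SR, DUR = 44100, 1.0
    target = np.load("target_clip.npy")             # mono target sound (placeholder file)
    ccs = {16: 64, 17: 64, 18: 64}                  # cc number -> current value (0..127)
    out = mido.open_output()                        # default MIDI output

    def spectrum(x):
        return np.abs(np.fft.rfft(x[: int(SR * DUR)]))

    def render():
        for cc, val in ccs.items():
            out.send(mido.Message("control_change", control=cc, value=val))
        out.send(mido.Message("note_on", note=60, velocity=100))
        audio = sd.rec(int(SR * DUR), samplerate=SR, channels=1)
        sd.wait()
        out.send(mido.Message("note_off", note=60))
        return audio[:, 0]

    best = np.linalg.norm(spectrum(render()) - spectrum(target))
    for _ in range(200):
        cc = np.random.choice(list(ccs))
        old = ccs[cc]
        ccs[cc] = int(np.clip(old + np.random.randint(-16, 17), 0, 127))
        err = np.linalg.norm(spectrum(render()) - spectrum(target))
        if err < best:
            best = err                              # keep the improvement
        else:
            ccs[cc] = old                           # revert the nudge

Swapping the random nudges for a genetic algorithm or CMA-ES is mostly a matter of replacing that inner loop; the expensive part is always the render-and-record step.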
This is incredible. I'd love to see something similar for sound synthesis (how do you craft that perfect piano/violin/dubsteppy sound?) and mixing/mastering.
> This is usually called "MIDI mapping" or similar and is available in basically every DAW these days.
I'm aware of MIDI mapping; my question was about the possibility of making a standalone synthesizer (with keyboard and knobs) in which the necessary controls would be read from analog inputs while still being displayed on screen as if they'd been changed with the mouse, which is not easy to do when playing live.
> Something like qjackctl (with Jack, obviously) could do this for you, as you can route things however you want in a drag-and-drop UI.
Sorry, my second sentence wasn't clear at all. I'd love to see a general-purpose library (not tied to sound or any particular use) for graphically representing structures linked together (list nodes), the way this software and others do with generators, effects, etc. Ideally it would operate on a header embedded in each structure that contains the fields linking it to others. When I alter the links on screen, it does the same to the represented nodes.
That's how I'd build, for example, a drum machine in which each structure also contains a pattern, and I could move them back and forth at will, replicate them, and set their own fields (number of repeats, etc.). That would again be a sound-related application, but what I'm looking for is something truly general purpose.
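In Python terms (just to illustrate the header idea, with made-up names), every payload would embed a link object, and the generic editing code would only ever touch those links, never the payload:

    # Every node owns a Link; a generic graph/list editor manipulates Links only.
    class Link:
        def __init__(self, owner):
            self.owner, self.prev, self.next = owner, None, None

        def insert_after(self, other):
            other.prev, other.next = self, self.next
            if self.next:
                self.next.prev = other
            self.next = other

        def unlink(self):
            if self.prev: self.prev.next = self.next
            if self.next: self.next.prev = self.prev
            self.prev = self.next = None

    class DrumPattern:                  # payload: the generic code never looks inside
        def __init__(self, steps, repeats=1):
            self.steps, self.repeats = steps, repeats
            self.link = Link(self)

    a = DrumPattern([1, 0, 0, 1])
    b = DrumPattern([1, 1, 0, 0], repeats=2)
    a.link.insert_after(b.link)         # chain a -> b; a graph UI would do this on drop

In C the same idea is the classic intrusive list header: the library only knows about the header fields, and the application recovers the enclosing structure from the header's address.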
I'd like to try using this kind of thing to auto-generate Beat Saber maps. Being able to orchestrate the beats very specifically would make for excellent mappings.
Interesting nonetheless. I keep waiting for someone to make one of these with BPM sync and the capability to jump to pre-set cue points in response to MIDI or OSC control messages.
I even wrote one like that in multithreaded C some years back - it leaked memory like a sieve, but it was tons of fun to do breakbeat juggling in it. (When I last revisited that, it didn't compile anymore and I couldn't even read the code.)
All in all it's always nice to see devs honing their skills with multimedia stuff instead of bureaucratic CRUD types of things.
I'm actually very interested in its application to musical patterns, i.e. the actual notes rather than the audio. I think there's already a tool that uses this to generate rich, musically correct MIDI on the fly, but I'm having trouble remembering the name/manufacturer now. Future Retro, maybe.
That sounds challenging - I'd love to see how you get on. I had a kind of related idea - train a model on a particular synth, then when passed a sample with a synth sound in it, it would try to suggest the patch settings that would most closely emulate the sound.
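As a toy version of that idea, assuming scikit-learn and standing in a throwaway numpy "synth" for the real one, you'd train a regressor from spectral features back to the patch parameters that produced them:

    # Train on (rendered clip, patch params) pairs generated by randomising the
    # synth itself, then predict parameters for a mystery sample.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    SR = 22050

    def toy_synth(cutoff, detune, dur=0.25):
        # Stand-in for the real synth: two detuned saws through a crude lowpass.
        t = np.arange(int(SR * dur)) / SR
        saw = lambda f: 2 * ((t * f) % 1.0) - 1
        x = saw(110) + saw(110 * (1 + 0.05 * detune))
        spec = np.fft.rfft(x)
        freqs = np.fft.rfftfreq(len(x), 1 / SR)
        spec[freqs > 200 + 5000 * cutoff] = 0        # brick-wall "filter"
        return np.fft.irfft(spec)

    def features(x):
        return np.log1p(np.abs(np.fft.rfft(x))[:512]) # crude log-spectrum fingerprint

    params = np.random.rand(500, 2)                   # (cutoff, detune) pairs in 0..1
    X = np.stack([features(toy_synth(*p)) for p in params])
    model = MLPRegressor(hidden_layer_sizes=(128,), max_iter=800).fit(X, params)

    mystery = toy_synth(0.3, 0.7)                     # pretend this came from a record
    print(model.predict(features(mystery).reshape(1, -1)))  # ~[0.3, 0.7] if it worked

The hard part with a real synth is that many different patches sound nearly identical, so the regression target is ambiguous; a search-based refinement on top of the first guess probably helps.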
Another way of getting similar results without hardware hacking is to use MIDI controllers combined with software that translates MIDI messages into keystrokes (and/or macros).
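A minimal bridge like that is only a few lines, assuming mido and pynput, with the note-to-key table obviously just an example mapping:

    # Translate incoming MIDI note-ons into keystrokes.
    import mido
    from pynput.keyboard import Controller

    kb = Controller()
    NOTE_TO_KEY = {36: "z", 38: "x", 42: "c"}        # kick/snare/hat pads -> keys

    with mido.open_input() as port:                  # default MIDI input port
        for msg in port:
            if msg.type == "note_on" and msg.velocity > 0 and msg.note in NOTE_TO_KEY:
                key = NOTE_TO_KEY[msg.note]
                kb.press(key)
                kb.release(key)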
I think my favourite part of this is being able to draw your own waveform and then play it. Super cool. You can learn about something so much faster when it's right there and instantly tweakable.
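Under the hood that trick mostly boils down to resampling the drawn points into one cycle and repeating it at pitch. A sketch assuming numpy and sounddevice, with made-up drawn values:

    # Interpolate hand-drawn points into one waveform cycle, then tile it at
    # the desired pitch and play roughly one second of the tone.
    import numpy as np, sounddevice as sd

    SR, FREQ = 44100, 220.0
    drawn = np.array([0.0, 0.9, 0.2, -0.7, -0.2, 0.6, -0.9, 0.0])  # "drawn" samples

    cycle_len = int(SR / FREQ)
    one_cycle = np.interp(np.linspace(0, len(drawn) - 1, cycle_len),
                          np.arange(len(drawn)), drawn)
    audio = np.tile(one_cycle, int(FREQ))
    sd.play(0.3 * audio.astype(np.float32), SR)
    sd.wait()

Being able to redraw the points and immediately hear the difference is exactly the instant-tweakability that makes these tools such good teachers.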
While interesting, I'm skeptical an algorithmic approach will ever come close to a decent wavetable-based synth (using samples/noise from the real instrument) or a scripted sampler instrument (like Kontakt). It might help when you can't find an existing synth or sample-based version of an instrument but can find recordings of it in use, though those cases are few and far between.
[0] https://www.soundonsound.com/techniques/synth-school-part-7