>Btw, transparency for this now-viral thread: I didn’t just paste prompts into dall-e, I played with style (e.g. cyberpunk, oil, etc.) to keep it interesting and diverse
>If I had to quantify, I’d say I’d generate 2 or 3 batches (tweaking the prompt) before choosing my favorite two pics. Each batch outputs 20 images (two tabs, 10 per), so probably technically cherry-picked 2 out of 60. That said, the other 58 usually weren’t really broken, just boring / a bit less fun
> you have to think of the underlying labeled text-to-image sets as paint colors to mix, and prepare a palette accordingly.
Very insightful tip on how to harness the "creativity" of Dall-E and the like.
I see how the phrase "king of belgium" was too vague for Dall-E, so it didn't produce anything recognizable - but changing the words into known details, like "banker" and "salt and pepper hair", worked effectively to generate concrete imagery.
Interesting... but my guess is it's using a big library of generated image-and-prompt pairs? So all its suggested prompts are right out of someone's 'stable diffusion prompt cheatsheet.pdf'. That is to say, it over-outputs commonly known artists and things like 'trending on deviant art'.
None, and as an experienced user you should know that it's not one-shot and most of the time not even few-shot... You can't compare cherry-picked press images with a few shots of a 5-second prompt. I don't know why you want to hype something up if you can't really compare it. It seems extremely attention-grifting.
Just look at their cherry-picks in this Discord: https://discord.com/invite/pxewcvSvNx
It's overfitted on copyrighted images (the Afghan Girl photo) and doesn't show more "compositional power" at all; most of the time it ignores half of the prompt.
And it's extremely funny, since most of these text-to-image models were trained on Shutterstock-watermarked images, to the point that with certain prompts the model thinks humans see the world with a big Shutterstock watermark across it.
From the videos I have seen of it in use, there is no way the prompts had matching existing material. One prompt was “painting of a goat in the style of the Mona Lisa taking photos with an iPad”, and it spat out 10 images that showed exactly that.
Wanted to give it a try just for fun, using the same prompts, base model, and parameters (as far as I can tell), and the first 5 images that came out... will probably haunt me in my dreams tonight.
I don't know if it was me misconfiguring it or if the images in the post were really cherry-picked.
Yes! So now you can guess why he was loading a different parameter file for each specific picture!
Translation: someone tried many different parameters for each image in this demo and then manually selected the ones that produced the best results.
This might still be useful if there is a good UI for users to do the same.
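For anyone curious what that kind of sweep looks like in practice, here's a minimal sketch using Hugging Face's diffusers library. The model ID, parameter grid, and filenames are my own assumptions for illustration, not whatever the demo author actually used:

```python
# Hypothetical sweep: generate many candidates per prompt by varying
# the seed and guidance scale, then keep them all for manual cherry-picking.
import itertools
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed base model
    torch_dtype=torch.float16,
).to("cuda")

prompt = "painting of a goat in the style of the Mona Lisa taking photos with an iPad"
seeds = [0, 1, 2, 3, 4]
guidance_scales = [5.0, 7.5, 10.0]

for seed, scale in itertools.product(seeds, guidance_scales):
    generator = torch.Generator("cuda").manual_seed(seed)
    image = pipe(
        prompt,
        guidance_scale=scale,
        num_inference_steps=50,
        generator=generator,
    ).images[0]
    # Save every candidate; a human picks the winners afterwards.
    image.save(f"goat_seed{seed}_cfg{scale}.png")
```

A UI for this would essentially be a contact sheet over that output directory, which is roughly what the "loading a different parameter file per picture" workflow amounts to.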
But then a human comes and selects one from a hundred images. Not to mention the human had to write the prompt, sometimes a very long and explicit one. I'd say that's enough human involvement to be able to use the image as their own.
Using several iconic photos as starting points, we asked ChatGPT for a detailed description of each image and then fed it to DALL·E 3 to create new images. The process was repeated two more times.
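The loop described there is simple to script. Here's a rough sketch with the OpenAI Python SDK; the model names, iteration count, and placeholder URL are assumptions based on the description, not the authors' actual code:

```python
# Hypothetical "telephone game": describe an image with a vision model,
# regenerate it with DALL-E 3 from that description, and repeat.
from openai import OpenAI

client = OpenAI()

def describe(image_url: str) -> str:
    """Ask a vision-capable chat model for a detailed description."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed; the article only says "ChatGPT"
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return resp.choices[0].message.content

def regenerate(description: str) -> str:
    """Turn the description back into a new image with DALL-E 3."""
    resp = client.images.generate(model="dall-e-3", prompt=description, n=1)
    return resp.data[0].url

image_url = "https://example.com/iconic-photo.jpg"  # placeholder starting image
for _ in range(3):  # first pass plus "two more times", per the description
    image_url = regenerate(describe(image_url))
    print(image_url)
```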
That made my day. I'm kind of worried, though, that the examples are hand-picked and that by default it doesn't look so nice. I would love to know the algorithm behind this.