
That's exactly right: he specified a style with each prompt and cherry-picked from 40-60 pictures: https://twitter.com/nickcammarata/status/1512119623315075081

>Btw transparency for this now-viral thread: I didn’t just paste prompts into dall-e, I played with style (eg. cyberpunk, oil, etc) to keep it interesting and diverse

>If I had to quantify, I’d say I’d generate 2 or 3 batches (tweaking prompt) before choosing my fav two pics, each batch outputs 20 images (two tabs 10 per), so prob technically cherry picked 2 out of 60. That said usually other 58 weren’t really broken, just boring / bit less fun
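
In script form, that batch-and-cherry-pick workflow looks roughly like this. A minimal sketch assuming the openai Python client; the prompt, style list, and batch size are illustrative, not his actual settings:

    # Sketch of the "one batch per style, then cherry-pick" workflow.
    # Assumes the openai Python package; all values here are illustrative.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    base_prompt = "a fox reading a newspaper in a cafe"
    styles = ["cyberpunk", "oil painting", "watercolor"]

    for style in styles:
        # DALL-E 2 allows up to n=10 images per request, matching the
        # "two tabs, 10 per" batches described in the tweet.
        batch = client.images.generate(
            model="dall-e-2",
            prompt=f"{base_prompt}, {style}",
            n=10,
            size="1024x1024",
        )
        for i, img in enumerate(batch.data):
            print(style, i, img.url)  # a human then picks two favorites by eye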




> you have to think of the underlying labeled text-to-image sets as paint colors to mix, and prepare a palette accordingly.

Very insightful tip on how to harness the "creativity" of Dall-E and the like.

I see how the phrase "king of Belgium" was too vague for DALL-E, so it didn't produce anything recognizable - but swapping in well-known details like "banker" and "salt and pepper hair" worked effectively to generate concrete imagery.

Hilarious results. :)
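
To make the "palette" idea concrete: the trick is to replace a vague phrase with descriptors the training data is dense in. A toy sketch (the descriptor and style lists are hypothetical):

    # Sketch: compose prompts from concrete "paint colors" instead of
    # vague phrases. All descriptor/style lists here are hypothetical.
    subject = "a banker with salt and pepper hair"  # instead of "king of belgium"
    details = ["wearing a sash", "standing in a palace"]
    styles = ["oil painting", "press photo"]

    prompts = [", ".join([subject, *details, style]) for style in styles]
    for p in prompts:
        print(p)
    # -> a banker with salt and pepper hair, wearing a sash,
    #    standing in a palace, oil painting
    # -> ... press photo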


Interesting, but my guess is it's using a big library of generated image and prompt pairs? So all its suggested prompts are straight out of someone's 'stable diffusion prompt cheatsheet.pdf'. That is to say, it over-represents commonly known artists and phrases like 'trending on deviant art'.
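
If that guess is right, the suggestion feature could be little more than nearest-neighbor retrieval over a stored prompt corpus. A sketch under that assumption, using sentence-transformers; the model choice and corpus are placeholders:

    # Sketch of prompt suggestion as nearest-neighbor retrieval over a
    # stored corpus of prompts -- just one guess at how such a tool works.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    # Placeholder corpus; a real tool would index thousands of scraped prompts.
    corpus = [
        "portrait of a wizard, trending on artstation, 8k",
        "cyberpunk city at night, by greg rutkowski",
        "oil painting of a cat, deviantart",
    ]
    corpus_emb = model.encode(corpus, convert_to_tensor=True)

    query_emb = model.encode("a wizard portrait", convert_to_tensor=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
    for hit in hits:
        print(corpus[hit["corpus_id"]], hit["score"])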

It's an interesting set-up. Viewing others' images and seeing their exact prompts is just as entertaining as generating your own.

None, and as an experienced user you should know that it's not one-shot and most of the time not even few-shot... You can't compare cherry-picked press images with a few shots from a 5-second prompt. I don't know why you want to hype something up if you can't really compare it. It seems extremely attention-grifting.

Just look at their cherry-picks in this Discord: https://discord.com/invite/pxewcvSvNx . It's overfitted on copyrighted images (the Afghan Girl) and doesn't show more "compositional power" at all, most of the time ignoring half of the prompt.


Aside from the examples in the thread, he was taking requests, and I thought these were impressive:

https://twitter.com/j_stonemountain/status/16987819625744466... (describing each panel of multiple comic book pages accurately, including text)

recognizing English cursive

https://twitter.com/j_stonemountain/status/16990947849404212...


Interesting. I assume the pictures are all pre-made? Or is he typing away like crazy?

I think I had around 12 human-created images in my batch.

Did you read the article? The author says he used Photoshop on all of them.

Note that the guy who submitted the images to the contest created hundreds of images and spent weeks fine-tuning the prompts and curating the results.

Midjourney/Stable Diffusion/DALL-E are doing to Photoshop what it did to traditional drawing methods. But there's still a human in the loop.


And it's extremely funny since most of these text-to-image models were trained on Shutterstock-watermarked images, to the point that they think humans see the world with a big Shutterstock watermark across it for certain prompts.

From the videos I have seen of it in use, there is no way the prompts were matched to pre-existing material. One prompt was “painting of a goat in the style of the Mona Lisa taking photos with an iPad”, and it spat out 10 images showing exactly that.

I think I read that he actually mis-scales the images on purpose because it gets a higher click-through rate.

It seems like the model was trained on images that were "trending on artstation", see here: https://www.reddit.com/r/DiscoDiffusion/comments/u01cnw/how_...

So it might be that all images have a distinctive look, or traces of it, and this phrase is becoming kind of an inside joke/meme.


Wanted to give it a try just for fun, using the same prompts, base model, and parameters (as far as I can tell), and the first 5 images it created... will probably haunt me in my dreams tonight.

I don't know if it was me misconfiguring it, or if the images in the post were really cherry-picked.
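
For what it's worth, Stable Diffusion results only reproduce if every knob matches, including the random seed, so "same prompts and parameters" is hard to verify. A minimal sketch with diffusers, assuming the v1.5 checkpoint and illustrative settings:

    # Sketch: reproducing a Stable Diffusion image requires pinning the
    # seed along with the prompt, checkpoint, steps, and guidance scale.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    generator = torch.Generator("cuda").manual_seed(1234)  # illustrative seed
    image = pipe(
        "an astronaut riding a horse",  # illustrative prompt
        num_inference_steps=50,
        guidance_scale=7.5,
        generator=generator,
    ).images[0]
    image.save("out.png")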


Yes! So now you can guess why he was loading a different parameter file for each picture! Translation: someone tried many different parameters for each image in this demo and then manually selected the ones that produced a better result. This might still be useful if there is a good UI for users to do the same.
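
The per-picture parameter file pattern is easy to replicate. A hedged sketch assuming each JSON file carries a prompt, seed, and guidance scale; the field names and directory layout are made up:

    # Sketch: replay per-image parameter files, then sweep seeds around
    # each configuration so a human can pick the best output by hand.
    # The JSON field names and paths are hypothetical.
    import glob
    import json

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    for path in glob.glob("params/*.json"):
        with open(path) as f:
            cfg = json.load(f)
        for seed in range(cfg["seed"], cfg["seed"] + 5):  # small sweep per image
            gen = torch.Generator("cuda").manual_seed(seed)
            img = pipe(
                cfg["prompt"], guidance_scale=cfg["cfg_scale"], generator=gen
            ).images[0]
            img.save(f"candidates/{cfg['name']}_{seed}.png")  # curate by hand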

But then a human comes and selects one from a hundred images. Not to mention the human had to write the prompt, sometimes a very long and explicit one. I'd say that's enough human involvement to be able to use the image as his own.

Using several iconic photos as starting points, we asked ChatGPT for a detailed description of each image and then fed it to DALL·E 3 to create new images. The process was repeated two more times.
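
That describe-then-regenerate loop maps directly onto the OpenAI API. A sketch assuming a vision-capable chat model and DALL·E 3; the model names, starting URL, and iteration count are illustrative:

    # Sketch of the describe -> regenerate loop: a vision model describes
    # the current image, DALL-E 3 redraws from that description, repeated.
    from openai import OpenAI

    client = OpenAI()
    image_url = "https://example.com/iconic-photo.jpg"  # placeholder start

    for generation in range(3):
        desc = client.chat.completions.create(
            model="gpt-4o",  # assumed vision-capable model
            messages=[{"role": "user", "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]}],
        ).choices[0].message.content

        image_url = client.images.generate(
            model="dall-e-3", prompt=desc, n=1, size="1024x1024"
        ).data[0].url
        print(f"generation {generation + 1}: {image_url}")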

That made my day. I'm kind of worried, though, that the examples are hand-picked and that by default it doesn't look so nice. I would love to know the algorithm behind this.

From his generative collection these ones really stand out:

https://img.inconvergent.net/img/gen/20170523-193637-305712-...

https://img.inconvergent.net/img/gen/20170520-230701-136920-...

It's almost hard to believe they are computer generated. They just seem so organic.

