Have you ever seen Ronald McDonald performing open-heart surgery? Thanos looking for his mom in a Walmart? R2-D2 getting baptized? These things never happened, but you can conjure an image of each in your mind. Our brains aren’t restricted to what we’ve seen before. One cluster of neurons remembers that ‘Ronald McDonald’ is a clown, another pictures what ‘open-heart surgery’ looks like, and you can combine the two in your imagination. This type of task is trivial for humans, even for children. Our brains are masters at meme recognition and generation, because these skills have been vital to our survival as a species. You can describe what a tiger is – orange and black stripes, long fluffy tail, big teeth and claws – and even if I’ve never seen one before, I can recognize when I’m in danger. These days we’re usually not in danger from tigers, but we can hijack the same mechanism to picture one eating a bowl of cereal. If you have a talent for art, you could draw your vision as a cartoon mascot for Kellogg's Frosted Flakes and make a grrreat deal of money. This sort of creative task is something computers have long struggled with, and the generally accepted belief was that art would be the last thing to get automated.
In January 2021, OpenAI announced DALL-E, an AI system that brought us the closest we’ve been so far to putting Picasso out of a job. It takes simple text and generates realistic images that have never existed before. It works by training an artificial version of our brain, a neural network, on millions of images from the internet. One cluster of artificial neurons ‘remembers’ that Ronald McDonald is a clown, and another pictures what ‘open-heart surgery’ looks like, so if you ask for “Ronald McDonald conducting Open-Heart Surgery” it can combine the two. But it’s not limited to mere imagination: the image can be generated and rendered for you on your computer screen. Do you want to see it as a cartoon? A pencil sketch? In a photorealistic style? Changing the image is as simple as changing a few words in the prompt. You’re not limited to simple changes. Swap out Ronald McDonald for the Hamburglar. Make him play chess instead of perform surgery. Paint it in the style of Vincent van Gogh. Any concept referenced in the training data is possible.
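To make the "change a few words" idea concrete, here is a minimal sketch of how you might assemble prompt variations before typing them into an image generator. Everything in it is an assumption for illustration: the lists, the `build_prompts` helper, and the "subject action, style" wording are made up here and are not part of DALL-E or any OpenAI interface. It only builds text and does not generate images.

```python
from itertools import product

# Hypothetical building blocks for illustration only;
# none of these names come from DALL-E itself.
SUBJECTS = ["Ronald McDonald", "the Hamburglar"]
ACTIONS = ["performing open-heart surgery", "playing chess"]
STYLES = ["as a cartoon", "as a pencil sketch", "in the style of Vincent van Gogh"]


def build_prompts(subjects, actions, styles):
    """Return one prompt string for every subject/action/style combination."""
    return [
        f"{subject} {action}, {style}"
        for subject, action, style in product(subjects, actions, styles)
    ]


prompts = build_prompts(SUBJECTS, ACTIONS, STYLES)

# Show every combination: 2 subjects x 2 actions x 3 styles.
for prompt in prompts:
    print(prompt)
```

Swapping a single entry in any list changes a whole family of prompts at once, which is the same trick as editing a few words by hand, just done in bulk.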
The model has about 12 billion parameters, which is small compared to the roughly 100 trillion synapses in our brain, but the technology is improving quickly. It’s no substitute for a talented artist, but for many tasks no human artist is available: generating a logo for a small business, programmatically making thousands of custom illustrations for a hobbyist blog, or creating artwork for a self-published book. There is of course also the possibility of abuse. OpenAI is rolling DALL-E out slowly to its waitlist, as it monitors early adopters for vectors of abuse. Its terms and conditions already forbid putting public figures in compromising positions. The most useful and popular task is likely to be ideation. If every business executive is suddenly empowered to try lots of combinations of ideas until they get what they want, they can give a far better brief to a creative for the final production, saving design time and getting better results. By blending memes that normally don’t go together, we can reach novel combinations nobody expected.
Name | Type |
---|---|
11 of the weirdest DALL-E-generated images on the internet | Blog |
DALL-E 2 | Reference |
Generating custom photo-realistic faces using AI | Blog |
How many neurons are in DALL-E? | Reference |