AI images generated from the text prompts "a baby daikon radish in a tutu walking a dog" and "an armchair in the shape of an avocado" (Image: OpenAI)
A neural network uses text captions to create outlandish images, such as armchairs in the shape of avocados, demonstrating that it understands how language shapes visual culture.
OpenAI, an artificial intelligence company, developed the neural network, which it calls DALL-E. It is a version of the company's GPT-3 language model, which can create expansive written works from short text prompts, but DALL-E produces images instead.
"The world isn't just text," says Ilya Sutskever, co-founder of OpenAI. "Humans don't just talk: we also see. A lot of important context comes from looking."
DALL-E is trained using a set of images already associated with text prompts, and then uses what it learns to try to build an appropriate image when given a new text prompt.
It builds the image element by element, based on what it has understood from the text prompt. If it is given part of a pre-existing image alongside the text, it also takes the visual elements of that image into account.
"We can give the model a prompt, like 'a pentagonal green clock', and given the preceding [elements], the model is trying to predict the next one," says Aditya Ramesh of OpenAI.
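The element-by-element generation Ramesh describes is, in essence, autoregressive next-token prediction: each new piece of the image is chosen given the text and everything generated so far. The sketch below is a toy illustration of that loop only; the `next_token_scores` function is a made-up stand-in for DALL-E's real neural network, which learns its scores from data.

```python
# Toy sketch of autoregressive generation: text tokens condition the
# sequence, and each image token is predicted from everything before it.
# next_token_scores is a hypothetical stand-in, NOT DALL-E's actual model.

def next_token_scores(sequence, vocab_size=16):
    # Deterministic dummy scores derived from the running sequence;
    # a real model would compute these with a trained neural network.
    seed = sum(sequence) % vocab_size
    return [(tok + seed) % vocab_size for tok in range(vocab_size)]

def generate_image_tokens(text_tokens, n_image_tokens=8, vocab_size=16):
    sequence = list(text_tokens)  # conditioning context: the prompt
    image_tokens = []
    for _ in range(n_image_tokens):
        scores = next_token_scores(sequence, vocab_size)
        best = max(range(vocab_size), key=lambda t: scores[t])  # greedy pick
        image_tokens.append(best)
        sequence.append(best)  # new token becomes context for the next step
    return image_tokens

# Invented token IDs standing in for a prompt like "a pentagonal green clock".
tokens = generate_image_tokens([3, 1, 4])
print(tokens)
```

In the real system the image tokens would then be decoded back into pixels; here they are just integers, to show the predict-append loop.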
For instance, if given an image of the head of a T. rex, and the text prompt “a T. rex wearing a tuxedo”, DALL-E can draw the body of the T. rex underneath the head and add appropriate clothing.
The neural network can trip up on poorly worded prompts, and it struggles to position objects relative to each other, or to count.
"The more concepts that a system is able to sensibly blend together, the more likely the AI system both understands the semantics of the request and can demonstrate that understanding creatively," says Mark Riedl at the Georgia Institute of Technology in the US.
"I'm not really sure how to define what creativity is," says Ramesh, who admits he was impressed with the range of images DALL-E produced.
The model produces 512 images for each prompt, which are then filtered by a separate computer model developed by OpenAI, called CLIP, down to the 32 results CLIP judges "best".
CLIP is trained on 400 million images available online. "We find image-text pairs across the internet and train a system to predict which pieces of text will be paired with which images," says Alec Radford of OpenAI, who developed CLIP.
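The filtering step can be sketched as scoring every candidate image against the prompt and keeping only the top 32. In the sketch below the "embeddings" are plain lists of random numbers and the similarity measure is a simple dot product, standing in for the learned text and image representations CLIP actually compares.

```python
import random

# Sketch of CLIP-style re-ranking: score each candidate against the text
# prompt, keep the top-k. The feature vectors here are random placeholders,
# NOT CLIP's learned embeddings.

def similarity(text_embedding, image_embedding):
    # Dot product as a stand-in for CLIP's learned similarity score.
    return sum(t * i for t, i in zip(text_embedding, image_embedding))

def rerank(text_embedding, candidate_embeddings, k=32):
    # Sort candidate indices by similarity to the prompt, best first.
    order = sorted(
        range(len(candidate_embeddings)),
        key=lambda idx: similarity(text_embedding, candidate_embeddings[idx]),
        reverse=True,
    )
    return order[:k]  # indices of the k best-matching candidates

random.seed(0)
# 512 fake candidate "images", each reduced to a 4-number feature vector.
candidates = [[random.random() for _ in range(4)] for _ in range(512)]
prompt_embedding = [0.1, 0.9, 0.2, 0.5]  # invented prompt features
best_32 = rerank(prompt_embedding, candidates, k=32)
print(len(best_32))
```

Swapping the placeholder dot product for a trained model's similarity score is what turns this generic top-k filter into the re-ranking Radford describes.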
"This is really impressive work," says Serge Belongie at Cornell University, New York. He says further work is needed to examine the ethical implications of such a model, such as the risk of creating entirely faked images, including ones involving real people.
Effie Le Moignan at Newcastle University, UK, also calls the work impressive. "But the thing with natural language is although it's clever, it's very cultural and context-appropriate," she says.
For instance, Le Moignan wonders whether DALL-E, confronted with a request to produce an image of Admiral Nelson wearing gold lamé pants, would put the military hero in leggings or underpants, potential evidence of the gap between British and American English.