Talking to AI Might Be the Most Important Skill of This Century
A product race is under way in the world of artificial intelligence. Just this week, Google announced plans to release Bard, a search chatbot based on its proprietary large language model; yesterday, Microsoft held an event unveiling a next-generation web browser with a supercharged Bing interface powered by ChatGPT. Though most big tech companies have been quietly developing their own generative-AI tools for years, these giants are scrambling to demonstrate their chops after the public release and runaway adoption of OpenAI’s ChatGPT, which has accumulated more than 30 million users in two months.
OpenAI’s success is an apparent signal to tech leaders that deep-learning networks are the next frontier of the commercial internet. AI evangelists will similarly tell you that generative AI is destined to become the overlay for not only search engines, but also creative work, busywork, memo writing, research, homework, sketching, outlining, storyboarding, and teaching. It will, in this telling, remake and reimagine the world. At present, sorting the hype from genuine enthusiasm is difficult, but given the billions of dollars being funneled into this technology, it’s worth asking, in ways large and small: What does the world look like if the evangelists are right? If this AI paradigm shift arrives, one vital skill of the 21st century could be effectively talking to machines. And for now, that process involves writing—or, in tech vernacular, engineering—prompts.
Image-generating models such as DALL-E 2 and Midjourney and text-generation tools like ChatGPT market themselves as a means for creation. But in order to create, one must know how to guide the machines to a desired outcome. Asking ChatGPT to write a five-paragraph book report about Animal Farm will yield forgettable, even inaccurate results. But writing the introductory paragraph to the book report yourself and asking the tool to complete the essay will feed the machine valuable context. Better yet, instruct the machine, “Write a five-paragraph book report at a college level with elegant prose that draws on the history of the satirical allegorical novel Animal Farm. Reference Orwell’s ‘Why I Write’ while explaining the author’s stylistic choices in the novel.” It will yield a far more sophisticated and convincing output.
Good prompts aren’t just specific. They seem to reflect a deeper understanding of the model you are trying to manipulate. One way to think of prompt trial and error is as an attempt to glean what information the model is pulling from and how the AI organizes and indexes the information at its disposal. It’s informed guesswork.
Despite making a living as a writer, I’m usually far too vague when instructing DALL-E 2 and Midjourney. When I had my 8-year-old nephew play with Midjourney this summer, his imagination conjured delightful scenes such as a flea surfing on a tsunami wave fighting a giant wasp, but, even together, we couldn’t come up with the details for our prompts to bring his specific vision to life. First, his flea didn’t look cartoonish enough; then, the tweaks I made turned the whole thing hyperrealistic and too scary for him. He lacked the stylistic language to talk to the model, and apparently, so did I.
To help people like me and my nephew, a cottage industry has already sprung up around those who can speak to the machines. On PromptBase, a marketplace for prompt engineers, you can purchase a few lines of text to feed into any number of generative-AI models. Some of the most popular prompts on the service are for generating “cute 3D renders of emojis in a clay style” with DALL-E 2 or creating sleek, modern logos via Midjourney. There are prompts that promise to generate new sports-team logos, and text hacks with names like Sentence Expander. For $3.99, Book Summarizer promises a prompt that will help “extract the essential information and takeaways from a book.” PromptBase’s sixth-most-popular seller, a prompt creator from Spain who goes by Imagineer, told me that prompt engineering is still a side hustle, having earned them just over 800 euros since September. “For me, it’s almost like a game,” they told me. “I like to think of prompts as little treasures.”
Imagineer’s prompt-writing process is informed by knowledge of design, illustration, and photography. When I asked why they thought they were good at prompt writing, they suggested it was a blend of natural skill and strategy. “I realized that I was better at talking to Midjourney than other people,” they said. And Midjourney allowed them to generate great results with less effort than when using DALL-E 2 and Stable Diffusion, another competitor. But Imagineer said that the most crucial element of a successful query is iteration. A good prompt “gives consistent and predictable results, and you get this when you generate a lot of images and see the variations when you alter some words or parameters,” they told me over email.
Subject-area expertise is also essential for text tools. Dan Shipper, an entrepreneur and writer, has been using ChatGPT since its release in November to help write his blog posts, which are now primarily about the future of AI tools. When he needs to describe a concept (say, the philosophical theory of utilitarianism, for a post about the disgraced cryptocurrency CEO Sam Bankman-Fried), he’ll ask ChatGPT to summarize the key points of the movement in a few sentences for a general audience. Once the machine furnishes the text, Shipper reviews it, checks it to make sure it’s accurate, and then spruces it up with his own rhetorical flourishes. “It allows me to skip a step—but only if I know what I’m talking about so I can write a good prompt and then fact-check the output,” he told me.
Shipper compared prompting ChatGPT to managing a bright and eager junior employee: The text tool is enthusiastic and skilled, but also inexperienced and thus more likely to make subtle, but crucial, mistakes. It’s also great at bullshitting when it doesn’t have the answer. Taste and experience, qualities that Shipper attributes to a good manager, are required to create a successful prompt. The day we spoke, Shipper told me he’d gotten ChatGPT to build him an impressive, thorough outline for a long post he was working on. “I wrote a bunch of bullet points and said, ‘Here are all the different things and quotes and ideas and phrases I’ve amassed.’ Then underneath that, I wrote, ‘Can you please format this into an outline of an essay?’” The more work Shipper does in fine-tuning the prompt, the better the output, he said.
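As a rough illustration, the structure Shipper describes (collected notes first, the request underneath) could be assembled in a few lines of Python. This is only a sketch: the function name and the sample notes are hypothetical, and the code builds the prompt text without sending it to any model.

```python
def build_outline_prompt(notes, request):
    """Put the collected notes first as bullets, then the request underneath."""
    # Skip empty entries so stray blank notes do not become empty bullets.
    bullets = "\n".join(f"- {note.strip()}" for note in notes if note.strip())
    return f"{bullets}\n\n{request.strip()}"


# Hypothetical notes, invented for illustration only.
notes = [
    "Prompting feels like managing a junior employee",
    "Taste and experience matter",
    "Always fact-check the output",
]
request = "Can you please format this into an outline of an essay?"

prompt = build_outline_prompt(notes, request)
print(prompt)
```

Keeping the raw material and the request in separate blocks mirrors the order Shipper describes; whether that order matters to a given model is something to test rather than assume.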
Sometimes, the prompt writing itself holds a specific kind of delight. Meg Conley, a writer who uses AI tools in her spare time, sees prompt engineering as a challenge akin to crafting a persuasive essay: “Very difficult. Mostly failure. And sheer joy when the words come together to make something that looks a little like the world you see in your head,” she wrote on Twitter back in November. It also holds a special personal thrill: Conley has aphantasia, which means that she has trouble visualizing images. After the release of Midjourney, she frequently stayed up late describing things from her imagination and honing her prompts until the resulting image felt right.
Most important, she told me, is knowing the model you’re speaking to. Each tool is built and trained differently, giving it unique aesthetics and vernacular—like how people who share a language have regional dialects and cultural quirks. “In the way that prose writing differs from technical or academic writing, there are different ways of marshaling the language depending on your audience,” she told me. “I’ve seen people who are really good at DALL-E 2, which seems to reward an ability to draw on references and high- and low-culture mash-ups. But the way I conceptualize the world is more along the lines of how Midjourney’s model works,” she said. Over time, Conley has familiarized herself with the model’s order of operations. “Something I’ve learned is the importance of the weight of a prompt,” she told me. “In Midjourney, if you type the word girl before the adjective red, it’ll focus on the girl more than the color red. With longer prompts, it’s like a puzzle, and you learn which terms to give more weight.”
Already, some teachers are banking on the notion that prompt writing is a skill their students might need in their careers. Ethan Mollick, a professor at the University of Pennsylvania who teaches about innovation and entrepreneurship, has revamped his syllabus since ChatGPT was released to the public. In one of his new lessons, Mollick asks his class to imagine ChatGPT as a student and to teach the chatbot by prompting it to write an essay about a particular class concept. Like a professor during office hours, the students must help the AI refine its essay until it appears to have sufficient mastery of the subject. Mollick hopes that the exercise will help the students learn by explaining, with the added benefit of teaching them to write deft prompts.
To hear Mollick tell it, prompt engineering lies somewhere between linguistics and problem solving. “Prompting is programming in prose with weirdness and stochastic results,” he told me. “I think that good prompting likely rewards divergent thinkers who find ways to experiment quickly. I think it rewards people with deep curiosities.”
It also rewards some deeper technical knowledge. One striking image I found on Midjourney’s Discord server was generated with the following, painfully detailed prompt:
in the style of Metaphysical painting, colored pencil drawing Smooth Shading & Blending, a sunrise reflects in a pond in the deep woods, a willow trees boughs hang over the edge of the pond, moody, intense emotions, deep perspective, natural lighting, Hyperdetailed, super High Contrast, intricate details, photography, raytracing, octane render, unreal engine --ar 3:2 --s 999 --chaos 50 --v 4 --v 4 -
More than just specificity, good prompts tend to reveal an awareness of the capabilities of the medium that the user is trying to replicate. The terms “octane render” and “unreal engine” refer to Octane Render and Unreal Engine, software used to produce 3-D computer graphics. Inputs like --v 4 are instructions to Midjourney to use the model’s newer, more powerful, and experimental version. Some of the best photorealistic-image prompts ask the model to imitate a specific camera or lens type; others demonstrate a working knowledge of art history or a particular artistic style. It’s reminiscent of the early days of search, when experts who could navigate Boolean operators, authors, keywords, sources, and date-range searches could unlock better results.
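To see how such a prompt divides into descriptive language and machine-facing settings, here is a minimal Python sketch that separates the two. It assumes parameters are written as `--name value` at the end of the prompt, as in the example above; the function name is hypothetical, the example string is a shortened version of that prompt, and this is not how Midjourney itself parses input.

```python
def split_prompt(prompt):
    """Split a prompt into its descriptive text and a dict of --name value settings."""
    # Everything before the first " --" is treated as the description.
    description, *param_chunks = prompt.split(" --")
    params = {}
    for chunk in param_chunks:
        parts = chunk.split(maxsplit=1)
        if not parts:
            continue  # ignore an empty fragment such as a trailing " --"
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        params[name] = value  # a repeated flag simply keeps its last value
    return description.strip(), params


# Shortened version of the prompt quoted above.
example = (
    "a sunrise reflects in a pond in the deep woods, moody, "
    "natural lighting --ar 3:2 --s 999 --chaos 50 --v 4"
)

description, params = split_prompt(example)
print(description)
print(params)
```

The sketch leaves the values uninterpreted. Of the flags shown, the only one explained here is `--v`, which selects the model version.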
Search engines, of course, are no longer as demanding. In order to drive and cater to mass adoption, Google’s tools became more powerful, making it easier to get a high-quality result with a clumsy or simple query. Mollick suggested that prompt engineering might simply be a placeholder: a rudimentary way for us to interact with AI models until they can synthesize what we want from bare-bones prompts or through other, as yet unknown means.
He could be right. Those who have seen early tests of OpenAI’s GPT-4 text models (which are not public) speak of them as something out of science fiction and suggest that the next leap will render the old tool obsolete. And yesterday, Microsoft unveiled an interface that, the company said, users will eventually talk to like a personal assistant. Instead of searching “How big is a Honda Odyssey?” and “IKEA Klippan loveseat dimensions,” you might ask the chatbot to solve an entire problem for you: “Will the IKEA Klippan loveseat fit into my 2019 Honda Odyssey if I fold down the seats?” Prompt engineering might just be a bridge to get us to the brave new world of whatever generative-media landscape comes next, but for now, it’s difficult to know how much to believe the hype.
Until the paradigm shifts, I remain drawn to AI prompts, which are usually far more intriguing than the outputs they yield. When people share AI-generated art or text, they frequently do so alongside the string of commands that brought it to life. Traveling back and forth between the instructions and the end result is revealing, even intimate. It’s a bit like being granted access into a person’s brain to see how they piece together disparate bits of knowledge, how they reason through a problem, or how they employ their creativity to produce something unexpected.
Like writing and coding before it, prompt engineering is an emergent form of thinking. It lies somewhere between conversation and query, between programming and prose. It is the one part of this fast-changing, uncertain future that feels distinctly human.