Text is the next horizon for deepfakes

Illustration: Rebecca Zisser/Axios

Researchers have broadened the controversial technology called "deepfakes" — AI-generated media that experts fear could roil coming elections by convincingly depicting people saying or doing things they never did.

Driving the news: A new computer program, created at OpenAI, the San Francisco AI lab, is the latest front in deepfakes, producing remarkably human-sounding prose that opens the prospect of fake news circulated at industrial scale.

  • As we have previously reported, existing technology can already create fake video, audio, and images — AI forgeries that together could transform the misinformation industry.

Details: The new OpenAI program is like a supercharged autocomplete: the kind of system Google uses to guess the next words in your search, but applied to whole blocks of text. Write the first sentence of a sci-fi story and the computer does the rest. Begin a news article and the computer completes it.

  • The quality varies. Sometimes, the text is a bit confused — but when it gets it right, the results are stunning. It has enormous potential for fiction or screenwriting.
  • But it could also be used for dystopian ends — huge, coordinated onslaughts of racist invective, fake news stories and made-up Amazon reviews.
  • "There are probably amazingly imaginative malicious things you could do with this technology," says Kristian Hammond, a Northwestern professor and CEO of AI company Narrative Science.

Why OpenAI created the program: The lab says it aims to create a general model for language — a program that can do what humans can (speak, understand, summarize, answer questions and translate) without being specifically trained to do each.

  • In terms of what can go wrong, this is ultimately a dual-use question — AI researchers are working on this technology for one reason, but bad players can use it for another.
  • "I think you have to ask yourself what systems like this might be capable of in several years," says Jack Clark, OpenAI's policy director. The quality of the program's writing is likely to keep improving as more data and computing power are added to the mix, he says.

How it works: The program "writes" by choosing the best next word based on both the human-written prompt and an enormous database of text it has read on the internet.

  • One reason the program is so convincing is that it was trained with enormous amounts of computing power and data — resources out of reach of many less wealthy research organizations.
  • An important point: The AI writer can only make stuff up. It can't tell the difference between a fact and a lie, which is part of what makes it volatile. Figuring out how to "teach" it what's true remains a huge challenge, says Hammond.
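
To make the "next word" idea concrete, here is a toy sketch in Python. It is not OpenAI's program and is far simpler than anything described above: it only counts which word tends to follow which in a tiny sample text, then extends a prompt with the most frequent follower. The function names (train_bigrams, complete) and the sample corpus are assumptions made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def complete(followers, prompt, max_words=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:
            break  # the last word was never seen in training, so stop
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical miniature "training data"; a real system reads vastly more text.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigrams(corpus)
print(complete(model, "the cat", max_words=3))
```

Even in this miniature form, the sketch shares the limitation Hammond describes: it tracks only what tends to follow what, and has no notion of whether the resulting sentence is true.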

Because of how this technology could be abused, OpenAI announced that it will not release the program that generates the most convincing-sounding text. (Next week, we will return to this aspect of the story.)

The big question: How dangerous are text deepfakes?

  • AI-generated images have reached a level where they're often indistinguishable from real photographs. That's not true for generated text, which can sometimes be incoherent.
  • One new door that AI-generated text opens: "Conversational," rather than "broadcast" deepfakes, says Aviv Ovadya, a misinformation researcher who founded the non-profit Thoughtful Technology Project. It could be used to influence people one-on-one on a massive scale, rather than distributing a small number of forgeries widely.

Go deeper: AI wrote this story