Mar 29, 2024 - Technology

OpenAI reveals artificial intelligence tool to re-create human voices

OpenAI CEO Sam Altman in Tel Aviv in June 2023. Photo: Jack Guez/AFP via Getty Images

OpenAI said on Friday it's allowed a small number of businesses to test a new tool that can re-create a person's voice from just a 15-second recording.

Why it matters: The company said it is taking "a cautious and informed approach" to releasing the program, called Voice Engine, more broadly given the high risk of abuse presented by synthetic voice generators.

How it works: Based on the 15-second recording, the program can create an "emotive and realistic" natural-sounding voice that closely resembles the original speaker.

  • This synthetic voice can then be used to read text inputs, even if the text isn't in the original speaker's native language.

Case in point: In one example offered by the company, an English speaker's voice was translated into Spanish, Mandarin, German, French and Japanese while preserving the speaker's native accent.

  • OpenAI said Voice Engine has so far been used to provide reading assistance to nonreaders, to translate content, and to help people who are nonverbal.

What they're saying: OpenAI said the program has already been used in its text-to-speech application and in its ChatGPT Voice and Read Aloud tools.

  • "We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities," the company said.
  • "Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale."

The big picture: Similar programs have been developed and are currently available — and have been abused in scam calls, phishing schemes and other forms of fraud.

  • There's also concern that such programs will be used to ramp up political misinformation. Already, similar programs have been used in fake robocalls that used President Biden's voice to discourage voting.

Our thought bubble, Axios' Ina Fried: While there are many uses for cloning voices, including preserving speech for those at risk of losing it due to ALS and other diseases, the ease of doing so leads to all kinds of risks of fraud and misinformation.

  • And while OpenAI may not yet be releasing its technology broadly, other similar options will doubtless become available with fewer safeguards.
  • Apple offers a voice preservation option that allows people to save their own voice but keeps them in control. Others, like HeyGen, require video consent before allowing people to create an avatar that speaks and looks like someone.

Go deeper: ChatGPT will create a digital memory to help personalize its responses
