Apr 3, 2019 - Technology

An AI-generated Ellen DeGeneres voice is the future of deepfakes

Kaveh Waddell

Illustration of Ellen DeGeneres glitching — Illustration: Sarah Grillo/Axios

If you want to make a video deepfake, you can download free software and create it yourself. Someone with a bit of savvy and a chunk of time can churn out side-splitters like this one. Not so for audio deepfakes — at least not yet. Good synthetic audio is still the domain of startups, Big Tech and academic research.

What's happening: Pindrop, the audio biometrics company, is developing synthetic voices in order to train its own defenses to detect them. Vijay Balasubramaniyan, Pindrop's CEO, shared several fake voices with Axios.

Listen to one of the voices.
The AI-generated voice is clearly mimicking Ellen DeGeneres — but it's not quite right.

How it works: Pindrop's system listened to countless hours of DeGeneres talking in real life — mostly narrating her own audiobooks — and then used a cutting-edge AI technique to develop an impersonator, improving the synthetic voice until the system could no longer tell it apart from the real thing. Now, anyone can type a phrase into the system and have it read out in DeGeneres' voice.

Axios listened to this and several other Pindrop-generated voices. Each captured the real speakers' idiosyncrasies, but they were exposed by their robotic-sounding pace and cadence. To this, Balasubramaniyan replied:

"You are actually identifying all the things it takes to start mimicking a million years of human evolution in voice. Our synthesis systems do a good job at synthesizing a voice but not yet things like cadence, emotion and flair, which are all active areas of research."

But that doesn't mean these imperfect fakes couldn't cause some mischief now. Imagine you were already expecting a phone call from someone. You probably wouldn't be too suspicious if he sounded a bit robotic or stilted, especially if he told you he was sick and driving through a tunnel.

"We're communicating through this phone system that has a lot of security issues," says Aviv Ovadya, a misinformation researcher and founder of the Thoughtful Technology Project.
This is how Charlie Warzel, formerly of BuzzFeed News, tricked his own mother into falling for an AI mimicry of his voice.

Go deeper: Defending against audio deepfakes before it's too late
