Feb 4, 2020 - Technology

Twitter sets high bar for taking down deepfakes

Kyle Daly

Twitter on Tuesday announced a new policy aimed at discouraging the spread of deepfakes and other manipulated media, but the service will only ban content that threatens people's safety, rights or privacy.

Why it matters: Tech platforms are under pressure to stanch the flow of political misinformation, including faked videos and imagery. Twitter's approach, which covers a wide range of material but sets narrow criteria for deletion, is unlikely to satisfy critics or politicians like Joe Biden and Nancy Pelosi — who have both slammed platforms for allowing manipulated videos of them to spread.

Details: Starting next month, Twitter will be working to identify media that, according to a Tuesday blog post, has been "significantly and deceptively altered or fabricated."

That includes deepfakes — digital forgeries that use AI to generate fake footage — as well as media that's been more crudely manipulated, such as by cropping out part of a video, changing its speed or dubbing in different audio.

Yes, but: The company only expects to delete manipulated content that's shared with the intention of deceiving people and, crucially, that's likely to cause harm. Per the blog post, examples of potential harm include:

Threats to the physical safety of a person or group.
Risk of mass violence or widespread civil unrest.
Threats to the privacy of a person or group, or to their ability to freely express themselves or participate in civic events. Examples include stalking or unwanted and obsessive attention; targeted content that includes tropes, epithets or material that aims to silence someone; and voter suppression or intimidation.

Manipulated media that doesn't fit all criteria for removal may be:

Labeled as misleading.
Affixed with a warning that people see if they try to like or retweet it.
Algorithmically downranked so that, for instance, it doesn't show up in users' content recommendations.

For the record: Those criteria mean the viral video of Pelosi that had been slowed down to make her seem drunk would be labeled but not removed under the new policy, Twitter's head of site integrity Yoel Roth said on a press call.
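
How that plays out: The removal test described above can be read as a simple decision rule. The Python sketch below is one illustrative reading of the criteria as reported here, not Twitter's actual enforcement logic. The names `MediaAssessment`, `Action` and `decide_action` are hypothetical, and the inputs used for the Pelosi video are assumptions, since Roth only described the outcome.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NO_ACTION = "outside the manipulated media policy"
    LABEL = "may be labeled, given a warning, or downranked"
    REMOVE = "removed"


@dataclass(frozen=True)
class MediaAssessment:
    """Hypothetical reviewer judgments about one piece of media."""
    significantly_altered: bool   # "significantly and deceptively altered or fabricated"
    shared_deceptively: bool      # shared with the intention of deceiving people
    likely_to_cause_harm: bool    # e.g. threats to safety, privacy or civic participation


def decide_action(media: MediaAssessment) -> Action:
    """Apply the reported criteria: removal only when every condition holds."""
    if not media.significantly_altered:
        return Action.NO_ACTION
    if media.shared_deceptively and media.likely_to_cause_harm:
        return Action.REMOVE
    # Manipulated media that misses a removal criterion may still be
    # labeled, affixed with a warning, or downranked.
    return Action.LABEL


# Assumed inputs for the slowed-down Pelosi video: altered and shared
# deceptively, but not judged likely to cause the listed harms.
pelosi_video = MediaAssessment(
    significantly_altered=True,
    shared_deceptively=True,
    likely_to_cause_harm=False,
)
print(decide_action(pelosi_video).name)
```

Under those assumed inputs the sketch lands on the label tier, which matches the outcome Roth described.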

The bottom line: Twitter is going noticeably broader with its manipulated media rules than Facebook, which announced its own policy last month.

Twitter's ban covers deceptively manipulated content, whether AI-generated "deepfake" or cruder "cheapfake," that's likely to cause harm.
Facebook's ban is specifically limited to misleading videos that have been manipulated by AI- or machine-learning-based tools. Cruder fakery could still get flagged and fact-checked under Facebook’s overall misinformation policy.

Yet critics of how tech has handled misinformation likely won't be satisfied, since the outright ban will probably apply only rarely.

What's next: Twitter will start enforcing the policy March 5. To identify manipulated media, it will draw on assistance from crowd-sourced content reports as well as outside partners.

Partners that offered feedback as Twitter was developing the policy were Witness; Paul Barrett, deputy director of the NYU Stern Center for Business and Human Rights; and the Oxford Reuters Institute, a Twitter spokesperson told Axios.

Editor’s note: This story has been corrected to reflect that the outside parties gave feedback on the policy, not that they are launch partners. It has also been further updated to lay out more clearly the distinctions between Twitter's and Facebook's policies.