Oct 27, 2023 - Technology

"Poison pill" could sabotage AI trained with unlicensed images

Illustration: Natalie Peeples/Axios

Artists looking to protect their works from AI models may soon be able to add invisible pixels to their art that could thwart image-generating models seeking to copy and train on that art.

The big picture: The goal of this "poison pill" is to trick an AI model during its training phase into cataloging an image as something other than it is, causing the model to generate useless results.

Why it matters: The tool (called "Nightshade") gives creators a way to penalize AI developers who try to use their work without permission, attribution and compensation — without resorting to a lawsuit.

  • Ben Zhao, a University of Chicago professor and the lead developer of Nightshade, told Axios that a few hundred poisoned images, or fewer, are enough to severely damage new versions of a model such as DALL-E, Midjourney or Stable Diffusion.
  • Zhao proposes Nightshade "as a last defense for content creators against web scrapers that ignore opt-out/do-not-crawl directives."

How it works: Nightshade's creators say their "prompt-specific poisoning attacks" can undermine how AI models categorize specific images. They have outlined their work in a paper that is yet to be peer reviewed.

  • They frame their innovation as a counter-offensive against AI developers scraping the open internet for content.
  • "Data poisoning attacks manipulate training data to introduce unexpected behaviors into machine learning models at training time ... effectively disabling its ability to generate meaningful images," according to the research paper.
  • The effect is that animals get labeled as plants, or buildings as flowers, and those errors cascade into the model's broader handling of related concepts (a simplified sketch follows this list).
  • Nightshade will be open to all developers and will also be integrated into an existing tool called Glaze.
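
A toy example can make the mechanism concrete. The sketch below is not Nightshade's actual algorithm: the paper describes invisible pixel-level perturbations that leave an image looking unchanged to humans, whereas this illustration makes the mismatch explicit by pairing captions for one concept with images of another. All names, functions and counts here are hypothetical and chosen only for illustration.

```python
# Toy sketch of prompt-specific data poisoning (NOT Nightshade's actual method).
# Idea: pair captions for concept A with image content of concept B, so a model
# trained on the scraped set learns the wrong association for prompts about A.

from dataclasses import dataclass
from collections import Counter
import random


@dataclass
class TrainingExample:
    caption: str        # text prompt the model is conditioned on
    image_concept: str  # what the image actually depicts (stand-in for pixels)


def make_poisoned_set(clean: list[TrainingExample],
                      target_prompt: str,
                      decoy_concept: str,
                      n_poison: int) -> list[TrainingExample]:
    """Inject n_poison examples whose captions mention the target prompt
    but whose image content is the decoy concept (e.g. 'dog' captions on
    images that effectively depict cats)."""
    poison = [TrainingExample(caption=target_prompt, image_concept=decoy_concept)
              for _ in range(n_poison)]
    mixed = clean + poison
    random.shuffle(mixed)
    return mixed


def learned_association(dataset: list[TrainingExample], prompt: str) -> str:
    """Stand-in for training: this 'model' simply adopts the majority image
    concept seen alongside a given caption."""
    seen = Counter(ex.image_concept for ex in dataset if ex.caption == prompt)
    return seen.most_common(1)[0][0]


if __name__ == "__main__":
    # 300 clean dog examples scraped from the web ...
    clean = [TrainingExample("a photo of a dog", "dog") for _ in range(300)]
    # ... plus a few hundred poisoned ones is enough to flip the association.
    poisoned = make_poisoned_set(clean, "a photo of a dog", "cat", n_poison=400)
    print(learned_association(poisoned, "a photo of a dog"))  # -> "cat"
```

In the approach the paper describes, by contrast, the captions are left intact and the perturbation is invisible to the eye; the "mislabeling" happens in how the model perceives the image, which is what would make the poison hard to filter out of a scraped dataset.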

Yes, but: Even if Nightshade works as well as its creators claim, the "poison pill" only affects newly trained versions of a model; existing versions remain intact.

  • "It works at training time and destabilizes it [the model] for good. Of course the model trainers can just revert to an older model, but it does make it challenging for them to build new models," Zhao said.
  • Zhao concedes Nightshade will likely trigger a "cat and mouse" game with AI developers that seek to patch the vulnerabilities in their systems.
  • Any tool powerful enough to disrupt an algorithm could prove dangerous if it falls into the wrong hands.
  • Bad actors have a track record of poisoning algorithms to make them spew racist output. Zhao pins the blame for any future offensive material on model developers: "They are responsible for what their models produce," he told Axios.

What we're watching: Whether this innovation and a string of lawsuits by creators against AI developers push more AI companies toward licensing images and other data.

  • Zhao's overall goal, he said, is to impose a price on what he sees as unethical behavior by some AI developers.

The other side: Watermarking is an alternative to the "cloaking" Glaze offers, but researchers say watermarks, including those invisible to the eye, are easily broken.

  • "We don't have any reliable watermarking at this point," University of Maryland computer science professor Soheil Feizi told Wired.
  • Getty on Sept. 25 revealed that it has partnered with AI hardware and software vendor Nvidia to launch Generative AI by Getty Images, which uses only images from Getty's library.