AI expert Scott Aaronson on fighting plagiarism

Photo illustration: Axios Visuals; Photo: Courtesy of University of Texas
As concerns mount about artificial intelligence, we caught up with University of Texas computer science professor Scott Aaronson, who has proposed an elegantly simple solution: digital watermarking.
- Watermarking has long been a way to distinguish the authentic from the counterfeit, especially with money.
Why it matters: The line between authentic and fake is getting harder and harder to discern in the AI age.
Driving the news: Google in October published a paper describing a new method for watermarking text generated by large language models (LLMs) like its Gemini.
- This kind of tool could be useful in combating misinformation online or plagiarism in education.
The big picture: The method is based on one originally proposed by Aaronson.
- Our condensed and edited conversation is part of our new (and occasional) "Back to school" series, in which we talk with an academic about their research.
Describe the problem you're trying to solve.
SA: "Academic cheating is probably the most prevalent use of large language models today, but it's not the most serious. More serious things we might worry about include Russian intelligence wanting to fill every discussion with pro-Kremlin talking points, for example."
How fast is the problem developing?
SA: "If you have a not-very-good large language model, you can tell that its outputs are LLM-generated just by staring at them. Even good ones today love bulleted lists, love the word 'delve' and love nice, tidy concluding paragraphs. But as LLMs get better and better, that could get harder and harder.
- "Professors and teachers get term papers turned in where they don't know whether an LLM wrote all of it or parts of it, and they're desperately seeking some tool to figure it out."
How does a digital watermark work?
SA: "Think of it as a statistical signal inserted into the series of tokens (words, for example) that the LLM outputs. It's not detectable by the ordinary user, only if you know what to look for.
- "The basic way to insert a watermark is to use the fact that all large language models inherently involve randomness. The LLM is computing the probability distribution of the next word and is repeatedly sampling from that distribution. Take 'The ball rolls down the ...': 90 percent of the time the next word is going to be 'hill.' Instead of picking the next word randomly, you pick one pseudorandomly."
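The idea above can be sketched in a few lines of Python. This is a minimal, illustrative version of keyed pseudorandom sampling in the spirit of Aaronson's proposal, not OpenAI's or Google's actual implementation: the next-word distribution, the secret key, and the helper names are all invented for the example. Each candidate token gets a keyed pseudorandom value r, and the sampler picks the token maximizing r ** (1/p), which preserves the model's distribution on average while biasing the keyed values in a way only the key holder can detect.

```python
import hashlib

def prf(key: bytes, context: tuple, token: str) -> float:
    """Keyed pseudorandom value in [0, 1) for a (context, token) pair."""
    h = hashlib.sha256(key + repr((context, token)).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def watermarked_pick(probs: dict, key: bytes, context: tuple) -> str:
    """Instead of drawing the next token at random from `probs`,
    pick the token maximizing r ** (1 / p). Over many positions the
    text still follows the model's distribution, but the chosen
    tokens carry a statistical bias visible to whoever holds the key."""
    return max(probs, key=lambda t: prf(key, context, t) ** (1.0 / probs[t]))

# Hypothetical next-word distribution for "The ball rolls down the ..."
probs = {"hill": 0.9, "street": 0.07, "stairs": 0.03}
context = ("The", "ball", "rolls", "down", "the")
choice = watermarked_pick(probs, b"secret-key", context)
```

Because the "randomness" is a deterministic function of the key and the context, the same prompt and key always yield the same choice, which is what makes later detection possible.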
What's the major challenge to making this happen?
SA: "Every company is leery of doing this unilaterally — they worry that some fraction of customers will leave for competing LLMs."
Could you game the watermark?
SA: "I don't think there's any perfect solution. If someone wants to remove a watermark, they could ask ChatGPT for something in French and then run it through a translator. But being a professor, I know most academic cheaters are pretty low-effort; otherwise they'd do the work.
- "Every possible detection method has ways to work around it. But it would be one more tool in the digital arsenal."
Who should have access to the detection tool?
SA: "The simplest way to roll this out would be for OpenAI to provide a web service: you paste text into a box, and it tells you whether the watermark is present with 99% probability.
- "But OpenAI from the beginning was leery of doing that because once you do that an attacker might also have access to the tool."
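A detection service of the kind described above could, under the same illustrative scheme as before, recompute the keyed pseudorandom values for each token in the submitted text and check whether they are biased. This sketch is an assumption about how such a detector might look, with invented helper names; a real service would convert the score into a rigorous p-value before claiming 99% confidence.

```python
import hashlib
import math

def prf(key: bytes, context: tuple, token: str) -> float:
    """Keyed pseudorandom value in [0, 1) for a (context, token) pair."""
    h = hashlib.sha256(key + repr((context, token)).encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def detect_score(tokens: list, key: bytes) -> float:
    """Average per-token watermark score. For text not generated with
    this key, the values r are uniform, so -ln(1 - r) averages about
    1.0. Text sampled with the matching key is biased toward high r,
    so the average climbs noticeably above 1.0 as the text gets longer."""
    total = 0.0
    for i, tok in enumerate(tokens):
        r = prf(key, tuple(tokens[:i]), tok)
        total += -math.log(1.0 - r)
    return total / len(tokens)

score = detect_score(["The", "ball", "rolls", "down", "the", "hill"], b"secret-key")
```

Note that the detector needs only the text and the key, not the model itself, which is why gating access to the key matters: anyone holding it can also probe the watermark to learn how to strip it.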
Have you been surprised at how quickly lawmakers and the public have grown concerned about AI?
SA: "One of the surprising things is that 10 years ago, AI safety seemed like the most niche thing: it was never going to be a political winner, and it was something people were going to laugh at, thinking it's the nerds terrified by the Terminator.
- "Watermarking is not going to save the world from the Terminator, but it's at least a start."
