Senate panel hears allegations Meta, others stole IP for AI training


Company documents allege AI and tech leaders have been complicit in pirating protected intellectual property for training AI, per witness testimony the Senate Judiciary crime subcommittee heard on Wednesday.
The big picture: Authors and publishers have been scrambling to protect their copyrighted works from AI models as tech companies push for more flexibility to use such work.
Driving the news: Maxwell Pritt, a partner at law firm Boies Schiller Flexner LLP, shared allegations of what he calls "likely the largest domestic piracy of intellectual property in our nation's history" at the hearing.
- From Pritt's testimony: "That piracy includes hundreds of terabytes of data and many millions of works, including, for example, at least 12 books authored by members of this subcommittee."
- "At Meta, company documents show that Mark Zuckerberg himself made the call."
- "Company documents at Anthropic also show a blatant disregard for our copyright laws, preferring to pirate books to avoid or delay the 'legal/practice/business slog,' as Anthropic's co-founder and CEO Dario Amodei put it."
Meta declined to comment. Anthropic and OpenAI did not immediately respond to requests for comment.
Context: Pritt, who's in charge of the firm's San Francisco office, is currently representing "authors, artists, and programmers in copyright infringement cases against AI companies including Meta, OpenAI, GitHub, and Midjourney," per his testimony.
- "Much of Meta's exploitation of pirated copyrighted works is now public through the efforts of my firm and our co-counsel on behalf of authors," his testimony reads.
- "Meta, OpenAI, Anthropic and others knowingly and intentionally pirated from illicit online marketplaces for financial gain and to seek a competitive advantage in AI," Pritt said at the hearing.
What they're saying: "For too long, AI companies and Big Tech have deceived the public. They have stolen copyrighted work and hurt content creators and the average Joe across America while making themselves rich," Sen. Josh Hawley, chair of the subcommittee, said in a statement to Axios ahead of the hearing.
- Companies have turned to "shadow libraries" and piracy networks to obtain datasets of copyrighted works, a subcommittee memo shared with Axios alleges.
- "For all of the talk about artificial intelligence and innovation, and the future that comes out of Silicon Valley, here's the truth that nobody wants to admit: AI companies are training their models on stolen material, period," Hawley said during the hearing.
Other witnesses testifying include author David Baldacci, Carnegie Mellon professor Michael Smith and law professors Bhamati Viswanathan and Edward Lee.