May 9, 2024 - Business

Reddit CEO: We want "clear terms" for using our public data

Photo: David Paul Morris/Bloomberg via Getty Images

Reddit wants anyone looking to use its public data to make a deal with the company.

  • "We're going to stay open, but to crawl Reddit or have access to reading content, you need to have some sort of agreement," CEO Steve Huffman told reporters Wednesday afternoon.

Why it matters: Publicly available data is becoming increasingly integral to building certain kinds of new AI products, such as ChatGPT or Claude.

Zoom out: Platforms and publishers that host a large amount of that content are racing to protect themselves from having their data siphoned off without adequate compensation, Axios' Sara Fischer has reported.

Zoom in: Reddit published its first-ever Public Content Policy on Thursday.

  • The intent is to lay out how the company thinks about its user-generated content and to outline boundaries of its use by external platforms for AI and other purposes.
  • "Reddit believes in an open internet, but not the misuse of public content," the policy states.

What they're saying: Commercial entities should have to pay for data access through "bespoke" arrangements that resemble M&A deals, Huffman said.

  • Businesses would also have to agree not to use Reddit data or content to do things like building a Reddit competitor, constructing user identities for background checks, archiving user content that's been deleted and training AI that is used to generate spam.
  • For researchers or platforms like the Internet Archive, data access may be free but there will be guardrails, Huffman said.

Between the lines: Reddit isn't against having its content used for training AI, but it must be done "on clear terms," according to Huffman.

  • "We're only doing agreements with people that we believe will be collaborative partners."

The intrigue: Huffman said he's not yet ready to name some of the bad actors he sees as handling data unethically.

  • "I look forward to that day. ... And I will happily tell our friends of the FTC who those people are."

Hope's thought bubble: Reddit is among the first large social platforms to lay out its thinking on derivations of its user-generated content.

  • This is as much a message to businesses seeking to rely on Reddit's data as it is to people who are worried about how their posts and information will be used in an age of AI.

What we're watching: Though Reddit would "rather do deals than not," Huffman doesn't expect revenue from commercial agreements to be the company's "largest business model."

  • "This doesn't make or break Reddit," said Huffman.
  • In its first earnings report as a publicly traded company, Reddit said that revenue from advertising grew 39% year-over-year to $222.7 million and made up 92% of its overall sales.
  • The line item for the commercial data category, currently under "Other," went from close to nothing to $20 million in Q1, Huffman noted.