Reddit files a lawsuit against Anthropic, accusing the AI firm of using its content without permission to train language models
On Wednesday, a complaint filed in a Northern California court alleges that Anthropic uses the site’s data to train AI models without a formal licensing agreement. Reddit is suing for this violation.
The complaint filed by Reddit alleges that Anthropic’s illicit use of the site’s data for commercial purposes was unlawful, and it also asserts that the AI startup violated Reddit’s user agreement.
Reddit’s lawsuit marks the first time a major technology company has taken legal action against an AI model provider for its training data practices. This action joins many publishers who have sued technology companies on similar grounds.
The New York Times has initiated legal proceedings against Microsoft and OpenAI for conducting training sessions on its news articles without compensation or authorization.
In the interim, Sarah Silverman and other authors of books have filed a lawsuit against Meta for training AI models in their works without their consent. Similar allegations have been made by music publishers and artists against AI audio, video, and image-generation startups, alleging the misuse of their content.

Ben Lee, Reddit’s chief legal officer, in a statement to TechCrunch, said:
“We will not tolerate profit-seeking entities like Anthropic commercially exploiting Reddit content for billions of dollars without any return for redditors or respect for their privacy,”
It is worth noting that Reddit has entered into agreements with other AI model providers, such as OpenAI and Google, that enable these companies to train AI models on Reddit’s data and incorporate the site’s posts into their respective AI chatbots’ responses.
Nevertheless, Reddit asserts in the filing that it is subject to specific provisions that safeguard the privacy and interests of its users, including OpenAI and Google.
Sam Altman, the CEO of OpenAI, is the third-largest shareholder in Reddit, with an 8.7% stake. He was previously a member of the company’s board of directors.

In the filing, Reddit asserts that it contacted Anthropic and explicitly stated that the AI startup lacked the authorization to extract or utilize Reddit’s Content. Nevertheless, Reddit asserts that Anthropic “refused to engage.”
In an email to TechCrunch, Anthropic spokesperson Danielle Ghighlieri stated, “We disagree with Reddit’s claims and will defend ourselves vigorously,”

In its complaint, Reddit asserts that Anthropic’s scraper algorithms disregarded the social network’s robots.txt files, a standard that instructs automated systems not to crawl websites.
Anthropic’s bots continued to scavenge the platform more than 100,000 times after Anthropic claimed to block them from scraping Reddit in 2024, according to the online community platform’s allegations.
Reddit requests that Anthropic pay compensatory damages and restitution for the amount by which Anthropic has been enriched by harvesting Reddit’s content. Reddit also seeks an injunction that would prevent Anthropic from utilizing Reddit’s content in the future.