Reddit Sues Anthropic Over Unauthorized AI Training Data Use

Reddit has initiated legal action against the artificial intelligence company Anthropic, alleging that the firm has been scraping and using Reddit's content without authorization to train its Claude AI model. The lawsuit, filed in a U.S. federal court, asserts that Anthropic violated Reddit's user agreement by continuing to access Reddit's servers, including more than 100,000 times after the company publicly stated that it had ceased such activity in July 2024.
Reddit is seeking damages, restitution, and a court order to prevent Anthropic from using any Reddit-derived data in its products. The order would also bar Anthropic from licensing or profiting from any AI programs trained on Reddit content. The social media platform has accused Anthropic of taking a "two-faced" approach, presenting itself as a responsible player in the AI industry while disregarding rules that interfere with its profit-making efforts.
The lawsuit highlights a broader controversy surrounding the training of large language models. Since the introduction of OpenAI's ChatGPT, there have been escalating concerns about the use of both copyrighted and user-generated materials in AI development. This issue has led to several lawsuits, including a high-profile case brought by The New York Times against OpenAI and Microsoft in 2023. Other plaintiffs include visual artists, authors, and record labels who claim their work was exploited without permission.
Anthropic is also facing additional legal challenges, including a lawsuit regarding its alleged use of copyrighted song lyrics and another from a group of authors who claim the company used pirated versions of their books as training materials. The tension has extended into the cultural arena, with artists expressing outrage over AI-generated imitations of their styles. Earlier this year, a trend of replicating the art style of the popular Japanese animation company Studio Ghibli raised concerns about copyright violations and artists losing out to AI programs trained on their own work.
In a submission to the UK Parliament last year, OpenAI acknowledged using copyrighted content in training, arguing that it would be "impossible" to develop leading AI systems without it. The company maintains that such practices are lawful. A recent proposal in the UK to ease copyright law and allow the use of copyrighted materials for training large language models (LLMs) has faced criticism from prominent artists, including Elton John. Despite its stance on protecting users, Reddit itself has struck licensing deals with firms like OpenAI, Google, Sprinklr, and Cision to allow access to its content for training purposes, as long as Reddit is compensated.
