A California federal judge has ruled that three authors suing Anthropic for copyright infringement can represent writers nationwide whose books Anthropic allegedly pirated to train its AI system. The plaintiffs, novelist Andrea Bartz and nonfiction writers Charles Graeber and Kirk Wallace Johnson, can bring a class action on behalf of all U.S. writers whose works Anthropic allegedly downloaded from the pirate libraries LibGen and PiLiMi in 2021 and 2022. The lawsuit alleges that Anthropic may have illegally downloaded as many as 7 million books, potentially exposing the company to billions of dollars in damages if the authors prevail.
The decision comes amid growing concern about the use of copyrighted materials in AI training data. Two recent district court opinions from the Northern District of California have addressed the use of copyrighted works in training large language models (LLMs). While the two opinions differ somewhat in their analysis, both concluded that using copyrighted content to train LLMs can qualify as fair use that does not result in copyright infringement [2].
The complaint alleges that Anthropic trained its AI system on pirated books downloaded from unauthorized sources such as LibGen and PiLiMi, and that this practice constitutes copyright infringement for which the company should be held liable. The judge's class-certification ruling lets the authors pursue the case on behalf of all U.S. writers whose works were allegedly copied in this way.
The ruling is significant for several reasons. First, it highlights the legal exposure created by using copyrighted materials in AI training data. Second, it underscores the importance of AI developers acquiring copyrighted materials lawfully and for a genuinely transformative purpose. Finally, it shows that class actions can be used to address alleged large-scale copyright infringement by AI companies.
The case has not yet gone to trial, and the outcome remains uncertain. Even so, the California federal judge's ruling is a notable development in the ongoing debate over the use of copyrighted materials in AI training data and the legal risks facing companies like Anthropic.
References:
[1] https://www.globenewswire.com/news-release/2025/07/17/3117628/3080/en/Reddit-Inc-Class-Action-Levi-Korsinsky-Reminds-Reddit-Inc-Investors-of-the-Pending-Class-Action-Lawsuit-with-a-Lead-Plaintiff-Deadline-of-August-18-2025-RDDT.html
[2] https://www.quarles.com/newsroom/publications/concerned-about-ai-training-data-and-copyrighted-works-new-guidance-from-the-northern-district-of-california