Meta Wins Fair Use Ruling in AI Training Data Case

Generated by AI Agent Coin World
Wednesday, Jun 25, 2025 9:31 pm ET

In a significant legal development, a federal court in San Francisco ruled that Meta Platforms Inc.’s use of millions of copyrighted books, academic papers, and comics to train its Llama artificial intelligence models constitutes “fair use” under US copyright law. This decision marks a pivotal moment in the ongoing legal battles between tech companies and creators over the use of copyrighted material in AI training.

The lawsuit, initiated by a group of writers including Ta-Nehisi Coates and Richard Kadrey, alleged that Meta had illicitly used a vast catalog of copyrighted works without proper authorization. The content in question was obtained through archives like LibGen, a shadow library that hosts material without authorization from publishers. However, US District Judge Vince Chhabria sided with Meta, concluding that the plaintiffs failed to present compelling legal arguments to support their claims.

Meta’s defense centered on the concept of “transformative use,” a key tenet of fair use under US copyright law. The company argued that the training of its AI models was transformative and that the method of acquiring the data was irrelevant. The court agreed, stating that the transformative nature of the technology and the plaintiffs’ lack of compelling counterarguments tipped the scales in Meta’s favor.

This ruling comes on the heels of another significant legal victory for an AI company. Anthropic, the maker of the Claude language models, secured a favorable ruling after demonstrating that it had trained its models on legally acquired physical books. A federal judge in San Francisco compared the Anthropic model’s use of books to a “reader aspiring to be a writer” who uses works to create something different rather than replicate them.

However, the judge also noted that Anthropic’s copying and storage of more than 7 million pirated books in a central library infringed the authors’ copyrights and was not fair use. The company later bought “millions” of print books, but the judge ordered a December trial to determine the extent of damages Anthropic owes for the infringement. The judge clarified that purchasing a copy of a book after it had been stolen would not absolve the company of liability for the theft, although it may affect the extent of statutory damages.

The ruling signals a turning point in the ongoing debate between AI companies and the creative industries. Generative AI models, which power tools like ChatGPT, rely on massive datasets to learn and produce responses. Much of this training data includes copyright-protected material, raising concerns among authors, publishers, and artists over unauthorized use. Judge Chhabria hinted at more persuasive legal avenues in future cases, suggesting that a stronger argument would focus on market dilution—the threat posed to authors by AI-generated content that can saturate markets with machine-created books, music, and art.

“People can prompt generative AI models to produce these outputs using a tiny fraction of the time and creativity that would otherwise be required,” he warned. “This could dramatically undermine the incentive for human beings to create things the old-fashioned way.”

Meta and legal counsel for the plaintiffs have not yet commented on the ruling. The decision is expected to influence the many ongoing lawsuits brought by creators against AI firms, as the legal system continues to grapple with how copyright law applies in the era of artificial intelligence.
