Judge Rules AI Training on Copyrighted Books Fair Use

Generated by Coin World AI agent
Tuesday, June 24, 2025, 1:28 pm ET · 3 min read

A federal judge in San Francisco has ruled that training an AI model on copyrighted works without specific permission does not violate copyright law, provided the materials are obtained legally. U.S. District Judge William Alsup stated that AI company Anthropic could assert a “fair use” defense against copyright claims for training its Claude AI models on copyrighted books. However, the judge emphasized that the method of obtaining these books is crucial. Alsup supported Anthropic’s claim that purchasing and digitizing millions of books for AI training constitutes “fair use.” Conversely, downloading millions of pirated copies of books from the internet and maintaining a digital library of these pirated copies is not permissible.

The judge ordered a separate trial to determine Anthropic’s liability and potential damages related to the storage of pirated books. The judge has yet to rule on whether to grant the case class action status, which could significantly increase the financial risks for Anthropic if it is found to have infringed on authors’ rights. Alsup’s ruling addressed a long-standing question in the AI industry: Can copyrighted data be used to train generative AI models without the owner’s consent? This question has been at the center of dozens of AI and copyright-related lawsuits filed over the past three years, most of which revolve around the concept of fair use.

Fair use allows the use of copyrighted material without permission if the use is sufficiently transformative, meaning it serves a new purpose or adds new meaning rather than simply copying or substituting the original work. Alsup’s ruling may set a precedent for these other copyright cases, although many of these rulings are likely to be appealed, meaning it will take years until there is clarity around AI and copyright in the U.S. According to the judge’s ruling, Anthropic’s use of the books to train Claude was “exceedingly transformative” and constituted “fair use under Section 107 of the Copyright Act.” Anthropic argued that its AI training was not only permissible but aligned with the spirit of U.S. copyright law, which it claimed “not only allows, but encourages” such use because it promotes human creativity. The company stated it copied the books to “study Plaintiffs’ writing, extract uncopyrightable information from it, and use what it learned to create revolutionary technology.”

While training AI models with copyrighted data may be considered fair use, Anthropic’s separate action of building and storing a searchable repository of pirated books is not, Alsup ruled. Alsup noted that purchasing a copy of a book after initially stealing it from the internet “will not absolve it of liability for the theft but it may affect the extent of statutory damages.” The judge also questioned Anthropic’s decision to download pirated books to save time and money in building its AI models, doubting that “any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use.”

The “transformative” nature of AI outputs is crucial but not the only factor in determining fair use. Other considerations include the type of work (creative works receive more protection than factual ones), the amount of the work used (the less, the better), and whether the new use hurts the market for the original. For instance, there is an ongoing case against Meta and OpenAI by comedian Sarah Silverman and two other authors, who filed copyright infringement lawsuits in 2023 alleging that pirated versions of their works were used without permission to train AI language models. The defendants argued that the use falls under the fair use doctrine because AI systems “study” works to “learn” and create new, transformative content.

Federal district judge Vince Chhabria pointed out that even if this is true, the AI systems are “dramatically changing, you might even say obliterating, the market for that person’s work.” However, he also criticized the plaintiffs for not providing enough evidence of potential market impacts. Alsup’s decision differed from Chhabria’s on this point. Alsup said that while it was true that Claude could increase competition for the authors’ works, this kind of “competitive or creative displacement is not the kind of competitive or creative displacement that concerns the Copyright Act.” Copyright’s purpose is to encourage the creation of new works, not to shield authors from competition, Alsup said, likening the authors’ objections to Claude to the fear that teaching school children to write well might result in an explosion of competing books. Alsup also noted that Anthropic had built “guardrails” into Claude to prevent it from producing outputs that directly plagiarized the books on which it had been trained. Neither Anthropic nor the plaintiffs’ lawyers immediately responded to requests to comment on the decision.
