Mixed Decision in Anthropic AI Case: Authors Guild Responds to Summary Judgment in Bartz v. Anthropic

The Authors Guild responds to summary judgment in Bartz v. Anthropic: A mixed decision for both clarity and fair use

On June 23, the U.S. District Court for the Northern District of California issued a summary judgment order in Bartz v. Anthropic—one of the first major lawsuits brought by authors against an AI company for using unlicensed books to train large language models (LLMs). The ruling, authored by Judge William Alsup, represents the first substantive decision on how fair use applies to generative AI systems.

Background and Established Facts

To understand the ruling, we first need to explain Anthropic’s actions at issue in the lawsuit. According to the court’s order, Anthropic downloaded over seven million books from pirate sites and digitized millions of purchased print books to build a “central library of ‘all the books in the world’” to support the training of its large language models. These pirated and scanned books were stored permanently in an internal digital library, which engineers used to assemble training datasets for Claude, Anthropic’s chatbot. Subsets of these books were copied, cleaned, and processed for use in training different versions of the model. It is undisputed that Anthropic made unauthorized copies from illegally downloaded books and in doing so committed mass copyright infringement; the question before the court was which—if any—of these actions were excused as “fair use,” permitted under the law. As such, the court addressed whether the following actions were fair use:

  1. The copying of pirated and scanned books in training AI;
  2. Scanning lawfully purchased print books into digital form; and
  3. Downloading, building, and retaining a permanent digital library of pirated books.

The court ruled that both #1 and #2 (training Claude on books, however obtained, and digitizing purchased, i.e., legally acquired, print books for internal storage) constituted fair use. However, it found that #3, Anthropic’s knowing download and copying of pirated books for its library (from which it made subset copies to train its AI), was not fair use. The case will proceed to trial on this issue and the resulting damages.

The Authors Guild’s Assessment

While the Authors Guild is relieved that the court recognized Anthropic’s massive, criminal-level, unexcused ebook piracy for what it is, we disagree with the decision that using pirated or scanned books for training LLMs is fair use. Although the court found that downloading books from pirate sites is “inherently, irredeemably infringing,” it appears to have held that making subsequent copies of those same works for the purposes of training a model (e.g., the translation of the work into tokens, reproduction of bits of the work for training iterations, etc.) is fair use. That holding, however, does not absolve Anthropic of liability for piracy.

The decision that the copies made in training and digitizing books are fair use starkly ignores or misinterprets legal precedent in several places. It feels as though the court rushed to issue a decision without fully understanding the copyright law and legal issues or the potential harm, and we have no doubt that it will be reversed on appeal. First, Judge Alsup took the extraordinary measure of insisting on hearing the summary judgment motions before discovery was complete and a full record established, especially on current and potential harm to the value of the authors’ works, which the decision summarily dismisses. Moreover, the decision overlooks established case law, such as the decisions in Capitol Records v. ReDigi and Hachette v. Internet Archive, which firmly establish that converting from one format to another is not transformative and that doing so without permission can cause grave market harm. Judge Alsup’s decision states that digitizing purchased books is fair use “because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies.” This reasoning flies in the face of the law and how the publishing markets work, including how authors license or sell rights separately for electronic, audio, and print markets, as the 1976 Copyright Act specifically allows and was intended to promote.

The decision also gives short shrift to the Supreme Court’s 2023 decision in Warhol v. Goldsmith, which clarified that courts should consider the transformativeness of a use in light of its commerciality, and the other factors and subfactors. The Supreme Court in Warhol specifically rejected the line of cases that, through a game of telephone, reduced fair use to the question of whether a work is transformative. Warhol held that transformation is a matter of degree that must be measured and weighed against the commerciality, other factor 1 subfactors, and the other three factors, and warns against weighing the other three factors in a manner that automatically favors fair use once any transformation has been found. Yet that is exactly what the court does in this decision: It summarily finds that the use is transformative, with little explanation, then dismisses each of the other factors in light of its finding of transformation. It also fails to look at the actual harm already being caused by Claude in generating books and articles that directly compete with the originals, much less the potential harm to the licensing marketplace and loss of licensing income. The decision simply states, “A market [for licensing] could develop. Even so, such a market for that use is not one the Copyright Act entitles Authors to exploit.” Why the court ignores that a licensing market has already been developed, much less thinks it can strip authors of those rights, is left unexplained. Lastly, the decision ignores the harm caused to authors and the value of their works due to market saturation and to their reputation when AI mistakes or misattributes their work, as is so often the case.

Pirate Sites Key to AI Training Supply Chain

The order makes the stunning revelation that Anthropic had downloaded more than seven million full-text books from notorious piracy websites, including Books3, Library Genesis (LibGen), and the Pirate Library Mirror (PiLiMi), despite knowing these books were unauthorized. Internal emails revealed that Anthropic’s leadership deliberately chose to “steal” books over lawful licensing to avoid what they called the “legal/practice/business slog.” This echoes the revelations emerging in other suits, including Kadrey v. Meta, where Meta’s use of criminal pirate sites LibGen and Z-Library was exposed, with a sign-off from CEO Mark Zuckerberg. Both cases show that AI companies have been downloading (and in the case of Meta, reuploading) pirated ebooks en masse to train LLMs, with the full knowledge of decision-makers at the highest level.

We expect that courts taking note of the staggering scope of piracy will send a clear message to all AI model developers and operators that they must license the books and other commercial copyrighted content that they use and may not help themselves to these works on pirate and other websites.

The Road Forward

In all events, the court’s decision is not the end of the case in the district court. The court made its ruling under a procedure known as summary judgment, which allows a court to decide certain issues without holding a trial if the facts relevant to those issues are not in dispute. Here, Anthropic asked the court to grant summary judgment on issues relating to fair use—essentially asking the court to hold that a trial is unnecessary and to rule entirely in its favor. But the court declined to do that. It granted summary judgment on certain issues—namely, the use of copies for training and the conversion of purchased copies into digital format—but denied Anthropic’s motion as to others. Specifically, the court wrote that “We will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages, actual or statutory (including for willfulness).” In other words, the court will hold a trial to determine the amount in damages that Anthropic is liable for as a result of its knowing use of pirated books—an activity that the court has determined is not fair use.

In addition, the court declined to grant summary judgment on whether any copies Anthropic made from its central library but not used for training were fair use. The court said that it “cannot determine the right answer concerning such copies because the record is too poorly developed as to them”—largely because “Anthropic has dodged discovery on these points.” The court will need to conduct additional proceedings to resolve that issue.

Finally, the court’s decision can be—and, we expect, will be—appealed. For the reasons already discussed, we are confident that its findings of fair use on the training and format-conversion issues will ultimately be reversed, as they are at odds with well-established Supreme Court and circuit court case law.

Read the full decision here (PDF).
