
Latest news with #Alsup

An AI firm won a lawsuit for copyright infringement — but may face a huge bill for piracy

Los Angeles Times

a day ago



To judge from the reaction among the AI crowd, a federal judge's Monday ruling in a copyright infringement case was a clear win for all the AI firms that use published material to 'train' their chatbots. 'We are pleased that the Court recognized that using works to train [large language models] was transformative — spectacularly so,' Anthropic, the defendant in the lawsuit, boasted after the ruling.

'Transformative' was a key word in the ruling by U.S. Judge William Alsup of San Francisco, because it's a test of whether using copyrighted works falls within the 'fair use' exemption from copyright infringement. Alsup ruled that using copyrighted works to train bots such as Anthropic's Claude is indeed fair use, and not a copyright breach.

Anthropic had to acknowledge a troubling qualification in Alsup's order, however. Although he found for the company on the copyright issue, he also noted that it had downloaded copies of more than 7 million books from online 'shadow libraries,' which included countless copyrighted works, without permission. That action was 'inherently, irredeemably infringing,' Alsup concluded. 'We will have a trial on the pirated copies and the resulting damages,' he advised Anthropic ominously: Piracy on that scale could expose the company to judgments worth untold millions of dollars.

What looked superficially like a clear win for AI companies in their long battle to use copyrighted material without paying for it to feed their chatbots now looks as clear as mud. That's especially true when Alsup's ruling is paired with a ruling issued Wednesday by U.S. Judge Vince Chhabria, who works out of the same San Francisco courthouse. In that copyright infringement case, brought against Meta Platforms in 2023 by comedian Sarah Silverman and 12 other published authors, Chhabria also ruled that Meta's training of its AI bots on copyrighted works is defensible as fair use. He granted Meta's motion for summary judgment.
But he provided plaintiffs in similar cases with a roadmap to winning their claims. He ruled in Meta's favor, he indicated, only because the plaintiffs' lawyers failed to raise a legal point that might have given them a victory. More on that in a moment.

'Neither case is going to be the last word' in the battle between copyright holders and AI developers, says Adam Moss, a Los Angeles attorney specializing in copyright law. With more than 40 lawsuits on court dockets around the country, he told me, 'it's too early to declare that either side is going to win the ultimate battle.' With billions of dollars, even trillions, at stake for AI developers and the artistic community, no one expects the law to be resolved until the issue reaches the Supreme Court, presumably years from now. But it's worthwhile to look at these recent decisions — and a copyright lawsuit filed earlier this month by Walt Disney Co., NBCUniversal and other studios against Midjourney, another AI developer — for a sense of how the war is shaping up.

To start, a few words about chatbot-making. Developers feed their chatbot models on a torrent of material, much of it scraped from the web — everything from distinguished literary works to random babbling — as well as collections holding millions of books, articles, scientific papers and the like, some of it copyrighted. (Three of my eight books are listed in one such collection, without my permission. I don't know if any have been 'scraped,' and I'm not a party to any copyright lawsuit, as far as I know.) The goal is to 'train' the bots to extract facts and detect patterns in the written material that can then be used to answer AI users' queries in a semblance of conversational language. There are flaws in the process, of course, including the bots' tendency to make something up when they can't find an answer in their massive hoard of data.
In their lawsuits, writers and artists maintain that the use of their material without permission to train the bots is copyright infringement, unless they've been paid. The AI developers reply that training falls within the 'fair use' exemption in copyright law, which depends on several factors — whether only limited material is drawn from a copyrighted work, whether the resulting product is 'transformative,' and whether it significantly cuts into the market for the original work.

That brings us to the lawsuits at hand. Three authors — novelist Andrea Bartz and nonfiction writers Charles Graeber and Kirk Wallace Johnson — sued Anthropic for using their works without permission. In their lawsuit, filed last year, it emerged that Anthropic had spent millions of dollars to acquire millions of print books, new and used, to feed its bots. 'Anthropic purchased its print copies fair and square,' Alsup wrote. It's generally understood that the owners of books can do almost anything they wish with them, including reselling them.

But Anthropic also downloaded copies of more than 7 million books from online 'shadow libraries,' which include untold numbers of copyrighted works, without permission. Anthropic 'could have purchased books, but it preferred to steal them to avoid 'legal/practice/business slog,'' Alsup wrote. (He was quoting Anthropic co-founder and CEO Dario Amodei.)

Anthropic told me by email that 'it's clear that we acquired books for one purpose only — building LLMs — and the court clearly held that use was fair.' That's correct as far as it goes. But Alsup found that Anthropic's goal was not only to train LLMs, but to create a general library 'we could use for research' or to 'inform our products,' as an Anthropic executive said, according to legal papers. Chhabria's ruling in the Meta case presented another wrinkle. He explicitly disagreed with Alsup about whether using copyrighted works without permission to train bots is fair use.
'Companies have been unable to resist the temptation to feed copyright-protected materials into their models—without getting permission from the copyright holders or paying them.' He posed the question: Is that illegal? And answered, 'Although the devil is in the details, in most cases the answer will be yes.'

Chhabria's rationale was that a flood of AI-generated works will 'dramatically undermine the market' for the original works, and thus 'dramatically undermine the incentive for human beings to create things the old-fashioned way.' Protecting the incentive for human creation is exactly the goal of copyright law, he wrote. 'While AI-generated books probably wouldn't have much of an effect on the market for the works of Agatha Christie, they could very well prevent the next Agatha Christie from getting noticed or selling enough books to keep writing.'

Artists and authors can win their copyright infringement cases if they produce evidence showing the bots are affecting their market. Chhabria all but pleaded for the plaintiffs to bring some such evidence before him: 'It's hard to imagine that it can be fair use to use copyrighted books to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books.' But 'the plaintiffs never so much as mentioned it,' he lamented. As a result, he said, he had no choice but to give Meta a major win against the authors.

I asked the six law firms representing the authors for their response to Chhabria's implicit criticism of their lawyering, but heard back from only one — Boies Schiller Flexner, which told me by email, 'despite the undisputed record of Meta's historically unprecedented pirating of copyrighted works, the court ruled in Meta's favor. We respectfully disagree with that conclusion.' All this leaves the road ahead largely uncharted.
'Regardless of how the courts rule, I believe the end result will be some form of licensing agreement,' says Robin Feldman, director of the Center for Innovation at UC College of the Law. 'The question is where will the chips fall in the deal and will smaller authors be left out in the cold.'

Some AI firms have reached licensing agreements with publishers allowing them to use the publishers' copyrighted works to train their bots. But the nature and size of those agreements may depend on how the underlying issues of copyright infringement play out in the courts. Indeed, Chhabria noted that filings in his court documented that Meta was trying to negotiate such agreements until it realized that a shadow library it had downloaded already contained most of the works it was trying to license. At that point it 'abandoned its licensing efforts.' (I asked Meta to confirm Chhabria's version, but didn't get a reply.)

The truth is that the AI camp is simply trying to get for free something it should be paying for. Never mind the trillions of dollars in revenue they say they expect over the next decade — they claim that licensing will be so expensive it will stop the march of this supposedly historic technology dead in its tracks. Chhabria aptly called this argument 'nonsense.' If using books for training is as valuable as the AI firms say it is, he noted, then surely a market for book licensing will emerge. That is, it will — if the courts don't give the firms the right to use stolen works without compensation.

Huge AI copyright ruling offers more questions than answers

Miami Herald

2 days ago



While sci-fi movies from the 1980s and '90s warned us about the potential for artificial intelligence to destroy society, the reality has been much less dramatic so far. Skynet was supposed to be responsible for the rise of killer machines called Terminators that could only be stopped by time travel and plot holes. The AI from "The Matrix" movies also waged a war on its human creators, enslaving the majority of them in virtual reality while driving the rebellion underground. To be fair, the artificial intelligence from OpenAI, Google Gemini, Microsoft Copilot, and others does threaten to destroy humanity, but only sometimes. And it looks like the technology is mostly harmless to our chances of survival. But that doesn't mean this transformative tech isn't causing other very real problems.

The biggest issue humans currently have with AI is how the companies controlling it train their models. Large language models like OpenAI's ChatGPT need to feast on a lot of information to beat the Voight-Kampff test from "Blade Runner," and a lot of that information is copyrighted. So at the moment, the viability of AI rests in the hands of the courts, not software engineers. This week, the courts handed down a monumental ruling that could have a wide-ranging ripple effect.

This week, Judge William Alsup of the U.S. District Court for the Northern District of California ruled that AI company Anthropic, and others, can train their AI models using published books without the authors' consent. The ruling could set an important legal precedent for the dozens of other ongoing AI copyright lawsuits.
A lawsuit filed by three authors accused Anthropic of ignoring copyright laws when it pirated millions of books to train its LLM, but Alsup sided with Anthropic. "The copies used to train specific LLMs were justified as a fair use," Alsup, who has also presided over Oracle America v. Google Inc. and other notable tech trials, wrote in the ruling. "Every factor but the nature of the copyrighted work favors this result. The technology at issue was among the most transformative many of us will see in our lifetimes."

Ed Newton-Rex, CEO of Fairly Trained, a nonprofit that advocates for ethically compensating creators of the data LLMs get trained on, had a unique take on the ruling after many headlines declared it a broad win for AI companies. "Today's ruling in the authors vs. Anthropic copyright lawsuit is a mixed bag. It's not the win for AI companies some headlines suggest - there are good and bad parts," he said in a lengthy X post this week. "In short, the judge said Anthropic's use of pirated books was infringing, but its training on non-pirated work was fair use."

So Anthropic is on the hook for pirating the material, but the judge ruled that it doesn't need the authors' permission to train its models. This means Anthropic's fair use argument stood up in court, but the ruling may not be as wide-ranging as it seems. "This is not a blanket ruling that all generative AI training is fair use. Other cases may go the other way, as the facts are different," Newton-Rex said. "The Copyright Office has already pointed out that some AI models are more transformative than others - for instance, they singled out AI music models as less transformative. Lobbyists will say this decision confirms that generative AI training is fair use - that's not true."

Anthropic's AI copyright 'win' is more complicated than it looks

Yahoo

2 days ago



Big Tech scored a major victory this week in the battle over using copyrighted materials to train AI models. Anthropic won a partial judgment on Tuesday in a case brought by three authors who alleged the company violated their copyright by storing their works in a library used to train its Claude AI model.

Judge William Alsup of the U.S. District Court for the Northern District of California ruled that Anthropic's use of copyrighted material for training was fair use. His decision carries weight. 'Authors cannot rightly exclude anyone from using their works for training or learning as such,' Alsup wrote. 'Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.'

Alsup called training Claude 'exceedingly transformative,' comparing the model to 'any reader aspiring to be a writer.' That language helps explain why tech lobbyists were quick to call it a major win. Experts agreed. 'It's a pretty big win actually for the future of AI training,' says Andres Guadamuz, an intellectual property expert at the University of Sussex who has closely followed AI copyright cases. But he adds: 'It could be bad for Anthropic specifically, depending on authors winning the piracy issue, but that's still very far away.' In other words, it's not as simple as tech companies might wish.
'The fair use ruling looks bad for creators on its surface, but this is far from the final word on the matter,' says Ed Newton-Rex, a former AI executive-turned-copyright campaigner and founder of Fairly Trained, a nonprofit certifying companies that respect creators' rights. The case is expected to be appealed—and even at this stage, Newton-Rex sees weaknesses in the ruling's reasoning. 'The judge makes assertions about training, not de-incentivizing creation, and about AI learning like humans do, that feel easy to rebut,' he says. 'This is, on balance, a bad day for creators, but it's just the opening move in what will be a long game.' While the judge approved training AI models on copyrighted works, other elements of the case weren't so favorable for Anthropic. Guadamuz says Alsup's decision hinges on a 'solid fair use argument on the transformative nature of AI training.' The judge thoroughly applied the four-factor test for fair use, Guadamuz noted, and the ruling could reshape broader copyright approaches. 'We may start seeing the beginnings of rules for the new world, [where] having legitimate access to a work would work strongly in proving fair use, while using shadow libraries would not,' he says. And that's the catch: This wasn't an unvarnished win for Anthropic. Like other tech companies, Anthropic allegedly sourced training materials from piracy sites for ease—a fact that clearly troubled the court. 'This order doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use,' Alsup wrote, referring to Anthropic's alleged pirating of more than 7 million books. That alone could carry billions in liability, with statutory damages starting at $750 per book—a trial on that issue is still to come. 
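The arithmetic behind that "billions" figure is easy to check. As a rough, illustrative sketch only — U.S. copyright law sets statutory damages of $750 to $30,000 per infringed work, rising to $150,000 per work for willful infringement, and the number of works found infringing and the per-work award would be decided at trial:

```python
# Back-of-the-envelope estimate of potential statutory damages.
# Assumptions (illustrative, not findings from the case): all ~7 million
# downloaded books count as infringed works, and awards fall within the
# statutory range of 17 U.S.C. § 504(c).
BOOKS = 7_000_000        # approximate count of pirated books at issue
STATUTORY_MIN = 750      # minimum statutory damages per work
WILLFUL_MAX = 150_000    # maximum per work for willful infringement

low_end = BOOKS * STATUTORY_MIN
willful_ceiling = BOOKS * WILLFUL_MAX

print(f"At the $750 minimum:    ${low_end:,}")          # $5,250,000,000
print(f"At the willful maximum: ${willful_ceiling:,}")  # $1,050,000,000,000
```

Even the statutory floor runs to billions of dollars, which is why the coming piracy trial, rather than the fair use ruling itself, may prove the consequential part of the case for Anthropic.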
So while tech companies may still claim victory (with some justification, given the fair use precedent), the same ruling also implies that companies will need to pay substantial sums to legally obtain training materials. OpenAI, for its part, has in the past argued that licensing all the copyrighted material needed to train its models would be practically impossible. Joanna Bryson, a professor of AI ethics at the Hertie School in Berlin, says the ruling is 'absolutely not' a blanket win for tech companies. 'First of all, it's not the Supreme Court. Secondly, it's only one jurisdiction: the U.S.,' she says. 'I think they don't entirely have purchase over this thing about whether or not it was transformative in the sense of changing Claude's output.' This post originally appeared at Fast Company.

Judge dismisses authors' copyright lawsuit against Meta over AI training

3 days ago



A federal judge on Wednesday sided with Facebook parent Meta Platforms in dismissing a copyright infringement lawsuit from a group of authors who accused the company of stealing their works to train its artificial intelligence technology. The ruling from U.S. District Judge Vince Chhabria was the second in a week from San Francisco's federal court to dismiss major copyright claims from book authors against the rapidly developing AI industry. Chhabria found that 13 authors who sued Meta 'made the wrong arguments' and tossed the case. But the judge also said that the ruling is limited to the authors in the case and does not mean that Meta's use of copyrighted materials is lawful.

Lawyers for the plaintiffs — a group of well-known writers that includes comedian Sarah Silverman and authors Jacqueline Woodson and Ta-Nehisi Coates — didn't immediately respond to a request for comment Wednesday. Meta also didn't immediately respond to a request for comment. 'This ruling does not stand for the proposition that Meta's use of copyrighted materials to train its language models is lawful,' Chhabria wrote. 'It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one.'

On Monday, from the same courthouse, U.S. District Judge William Alsup ruled that AI company Anthropic didn't break the law by training its chatbot Claude on millions of copyrighted books, but the company must still go to trial for illicitly acquiring those books from pirate websites instead of buying them. The actual process of an AI system distilling from thousands of written works to be able to produce its own passages of text qualified as 'fair use' under U.S. copyright law because it was 'quintessentially transformative,' Alsup wrote.

Chhabria, in his Meta ruling, criticized Alsup's reasoning on the Anthropic case, arguing that 'Alsup focused heavily on the transformative nature of generative AI while brushing aside concerns about the harm it can inflict on the market for the works it gets trained on.' Chhabria suggested that a case for such harm can be made.

In the Meta case, the authors had argued in court filings that Meta is 'liable for massive copyright infringement' by taking their books from online repositories of pirated works and feeding them into Meta's flagship generative AI system Llama. Lengthy and distinctively written passages of text — such as those found in books — are highly useful for teaching generative AI chatbots the patterns of human language. 'Meta could and should have paid' to buy and license those literary works, the authors' attorneys argued.

Meta countered in court filings that U.S. copyright law 'allows the unauthorized copying of a work to transform it into something new' and that the new, AI-generated expression that comes out of its chatbots is fundamentally different from the books it was trained on. 'After nearly two years of litigation, there still is no evidence that anyone has ever used Llama as a substitute for reading Plaintiffs' books, or that they even could,' Meta's attorneys argued. Meta says Llama won't output the actual works it has copied, even when asked to do so. 'No one can use Llama to read Sarah Silverman's description of her childhood, or Junot Diaz's story of a Dominican boy growing up in New Jersey,' its attorneys wrote.

Accused of pulling those books from online 'shadow libraries,' Meta has also argued that the methods it used have 'no bearing on the nature and purpose of its use' and it would have been the same result if the company instead struck a deal with real libraries. Such deals are how Google built its online Google Books repository of more than 20 million books, though it also fought a decade of legal challenges before the U.S. Supreme Court in 2016 let stand lower court rulings that rejected copyright infringement claims.

The authors' case against Meta forced CEO Mark Zuckerberg to be deposed, and has disclosed internal conversations at the company over the ethics of tapping into pirated databases that have long attracted scrutiny. 'Authorities regularly shut down their domains and even prosecute the perpetrators,' the authors' attorneys argued in a court filing. 'That Meta knew taking copyrighted works from pirated databases could expose the company to enormous risk is beyond dispute: it triggered an escalation to Mark Zuckerberg and other Meta executives for approval. Their gamble should not pay off.' 'Whatever the merits of generative artificial intelligence, or GenAI, stealing copyrighted works off the Internet for one's own benefit has always been unlawful,' they argued.

The named plaintiffs are Jacqueline Woodson, Richard Kadrey, Andrew Sean Greer, Rachel Louise Snyder, David Henry Hwang, Ta-Nehisi Coates, Laura Lippman, Matthew Klam, Junot Diaz, Sarah Silverman, Lysa TerKeurst, Christopher Golden and Christopher Farnsworth. Most of the plaintiffs had asked Chhabria to rule now, rather than wait for a jury trial, on the basic claim of whether Meta infringed on their copyrights. Two of the plaintiffs, Ta-Nehisi Coates and Christopher Golden, did not seek such summary judgment. Chhabria said in the ruling that while he had 'no choice' but to grant Meta's summary judgment tossing the case, 'in the grand scheme of things, the consequences of this ruling are limited. This is not a class action, so the ruling only affects the rights of these 13 authors — not the countless others whose works Meta used to train its models.'

