Millions of websites to get 'game-changing' AI bot blocker

BBC News, 5 hours ago
Millions of websites - including Sky News, The Associated Press and Buzzfeed - will now be able to block artificial intelligence (AI) bots from accessing their content without permission.

The new system is being rolled out by internet infrastructure firm, Cloudflare, which hosts around a fifth of the internet. Eventually, sites will be able to demand payment from AI firms in return for having their content scraped.

Many prominent writers, artists, musicians and actors have accused AI firms of training systems on their work without permission or payment.

In the UK, it led to a furious row between the government and artists including Sir Elton John over how to protect copyright.
Cloudflare's tech targets AI firm bots - also known as crawlers - programs that explore the web, indexing and collecting data as they go. They are important to the way AI firms build, train and operate their systems.

So far, Cloudflare says its tech is active on a million websites.

Roger Lynch, chief executive of Condé Nast, whose print titles include GQ, Vogue, and The New Yorker, said the move was "a game-changer" for publishers.

"This is a critical step toward creating a fair value exchange on the Internet that protects creators, supports quality journalism and holds AI companies accountable", he wrote in a statement.

However, other experts say stronger legal protections will still be needed.
'Surviving the age of AI'
Initially the system will apply by default to new users of Cloudflare services, plus sites that participated in an earlier effort to block crawlers.

Many publishers accuse AI firms of using their content without permission.

Recently the BBC threatened to take legal action against US-based AI firm Perplexity, demanding that it immediately stop using BBC content and pay compensation for material already used.

However, publishers are generally happy to allow crawlers from search engines, like Google, to access their sites, so that the search companies can in return direct people to their content. Perplexity accused the BBC of seeking to preserve "Google's monopoly".

But Cloudflare argues AI breaks the unwritten agreement between publishers and crawlers. AI crawlers, it argues, collect content like text, articles, and images to generate answers, without sending visitors to the original source, depriving content creators of revenue.

"If the Internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone," wrote the firm's chief executive Matthew Prince.

To that end the company is developing a "Pay Per Crawl" system, which would give content creators the option to request payment from AI companies for utilising their original content.
Battle the bots
According to Cloudflare there has been an explosion of AI bot activity. "AI Crawlers generate more than 50 billion requests to the Cloudflare network every day", the company wrote in March.

And there is growing concern that some AI crawlers are disregarding existing protocols for excluding bots.

In an effort to counter the worst offenders, Cloudflare previously developed a system where the worst miscreants would be sent to a "Labyrinth" of web pages filled with AI-generated junk.

The new system attempts to use technology to protect the content of websites and to charge AI firms to access it.

In the UK there is an intense legislative battle between government, creators and the AI firms over the extent to which the creative industries should be protected from AI firms using their works to train systems without permission or payment.

And, on both sides of the Atlantic, content creators, licensors and owners have gone to court in an effort to prevent what they see as AI firms' encroachment on creative rights.

Ed Newton-Rex, the founder of Fairly Trained, which certifies that AI companies have trained their systems on properly licensed data, said it was a welcome development - but there was "only so much" one company could do.

"This is really only a sticking plaster when what's required is major surgery," he told the BBC.

"It will only offer protection for people on websites they control - it's like having body armour that stops working when you leave your house," he added.

"The only real way to protect people's content from theft by AI companies is through the law."

Related Articles

Key Rulings on GenAI Training and Copyright Fair Use  Practical Law The Journal
Reuters

28 minutes ago

For a regularly updated case tracker covering intellectual property and privacy-related lawsuits concerning GenAI (including more decisions addressing fair use), see Generative AI: Federal Litigation Tracker on Practical Law.

Bartz v. Anthropic PBC: N.D. Cal.

On June 23, 2025, the US District Court for the Northern District of California held in Bartz v. Anthropic PBC that defendant Anthropic PBC's use of copyrighted books to train its GenAI tool was a fair use and granted summary judgment to Anthropic on this issue. The court also held that Anthropic's digital conversion of purchased print books to build its digital library was fair use but that its downloading of pirated copies for this purpose was not. (2025 WL 1741691 (N.D. Cal. June 23, 2025).)

Anthropic PBC developed the GenAI tool Claude, which generates text responses based on prompts from users. In part to train the large language models (LLMs) underlying Claude, Anthropic assembled a central library of digitized books, including copies purchased in print form and then scanned into digital format, as well as copies downloaded from pirate websites.

Authors Andrea Bartz, Charles Graeber, Kirk Wallace Johnson, and their affiliated corporate entities (collectively referred to as the 'Authors') filed a putative class action lawsuit against Anthropic in August 2024, alleging copyright infringement for using copies of their books to build its digital library and train the LLMs. Anthropic moved for summary judgment on the issue of fair use.

The district court analyzed the fair use factors under Section 107 of the Copyright Act to determine whether Anthropic's uses of the Authors' copyrighted works constituted fair use, separately evaluating the different uses at issue. Weighing the factors, the Bartz court concluded that Anthropic's use of the Authors' books to train the LLMs was fair use.
Specifically, the court found that:

  • The first factor (purpose and character of the use) strongly favored fair use because using the works to train the LLMs to generate new text outputs was 'quintessentially transformative.' Key to this finding was that Claude includes software to ensure that it does not output infringing content (and the Authors did not allege that any output was infringing). The court acknowledged that its analysis may change if the outputs were infringing.
  • The second factor (nature of the copyrighted work) weighed against fair use because the court accepted, at the summary judgment stage, that the Authors' published works contained expressive elements and that the works were selected for their expressive qualities.
  • The third factor (amount and substantiality of the portion used) favored fair use because, although Anthropic copied the Authors' entire works, this was reasonably necessary given the extensive data needed to train the LLMs. The court also stated that what matters is the amount and substantiality of what is made accessible to the public, again noting that there was no allegation or evidence that consumer-facing outputs were infringing.
  • The fourth factor (effect on the potential market) favored fair use.

The district court also considered whether Anthropic's copying of the Authors' works to build its central digital library was fair use. The court separately considered works that were:

  • Lawfully purchased in print format and converted to digital format, after which the print versions were destroyed and the digital versions were not redistributed.
  • Copied from pirate websites without authorization by or compensation to the Authors.

For the purchased print copies, the district court held that their conversion for a digital library was a fair use.
The court found:

  • The first factor favored fair use because converting the works from physical to digital format for storage and searchability was a transformative use, and Anthropic did not create additional copies or redistribute the digital versions.
  • The second factor weighed against fair use based on the presumptively (at the summary judgment stage) expressive nature of the works.
  • The third factor favored fair use because copying the entire work was necessary for the purpose of digital conversion and storage.
  • The fourth factor was neutral, as the format change may have displaced some digital purchases, but this did not relate to a market the Copyright Act entitles the Authors to exploit.

However, for the pirated library copies downloaded without authorization, the district court found no fair use justification and denied summary judgment to Anthropic. Anthropic copied these pirated works, as a substitute for purchasing them, to build a digital library available for any number of prospective uses (and maintained copies in the library even after deciding they would not be used to train the LLMs). The court held this use is not transformative. The court further recognized that the pirated copies directly displaced demand for purchased copies on a one-to-one basis and that condoning such piracy as fair use would destroy the publishing market. The court rejected Anthropic's arguments that the eventual transformative use of some copies for training the LLMs excused the initial piracy.

Kadrey v. Meta Platforms, Inc.: N.D. Cal.

On June 25, 2025, the US District Court for the Northern District of California granted summary judgment in Kadrey v. Meta Platforms, Inc. to Meta Platforms, Inc., finding that Meta's use of the plaintiffs' copyrighted books to train its GenAI tool was transformative and fair use.
However, the court stated its belief that the fair use defense is likely to be unsuccessful in other, similar cases where the copyright owners adequately show the dilutive harm that GenAI has on the general market for these works. (2025 WL 1752484 (N.D. Cal. June 25, 2025).)

The plaintiffs, thirteen authors, filed a lawsuit against Meta alleging direct copyright infringement, among other claims, based on Meta's use of unauthorized downloads of their books (from online shadow libraries) to train the LLMs underlying Llama, Meta's text-generating GenAI tool. After discovery, the parties cross-moved for summary judgment on whether Meta's use of the books was fair use.

The district court analyzed the fair use factors under Section 107 of the Copyright Act (17 U.S.C. § 107). In support of its finding that the first fair use factor (purpose and character of the use) favored fair use, the district court:

  • Held that Meta's use was highly transformative because it used the plaintiffs' books only to train LLMs, while the purpose of the books is to entertain and educate readers. The Kadrey court noted that transformative use does not insulate a defendant from infringement liability or even determine the first fair use factor. It is one aspect of the fair use analysis, and there are circumstances where market harm (the fourth factor) can be grounds to reject the defense for a transformative use.
  • Rejected the argument of the plaintiffs' law professor amici that the purpose and character of the parties' uses were similar because Meta's use of a book to train the LLMs was like a professor's use of a book to train a student. The district court noted that an LLM ingests text only to learn statistical patterns, not to interpret and understand its meaning as a student does, and that Meta's use was not analogous to giving a book to one person, but rather was to create a tool that everyone can use to exponentially multiply creative expression in a way that teaching a person does not.
  • Rejected the plaintiffs' argument that Meta's use was not transformative because Llama's output mimics and effectively repackages the plaintiffs' works. The evidence showed that Meta programmed Llama to be unable to regurgitate training content, and the plaintiffs' experts were unable to prompt the tool to generate more than 50 words from any of the plaintiffs' books.
  • Recognized that Meta's commercial use (and expectation of 460 billion to 1.4 trillion dollars in revenue over the next ten years) tends to weigh against fair use, but this did not tilt the first fair use factor in the plaintiffs' favor.
  • Recognized that Meta's unauthorized downloading of the books from shadow libraries without compensation to the plaintiffs may indicate bad faith, but questioned the relevance of bad faith and found it did not sway the first factor in this case. The court noted that Meta's practice might be more relevant to the character of the use if the plaintiffs showed the practice benefited the shadow libraries and furthered their unlawful activities.

The court held that the second factor (nature of the copyrighted work) weighed against fair use because the plaintiffs' books, consisting mostly of highly expressive works, are entitled to strong copyright protection. However, the court noted that this factor rarely plays a significant role in the fair use analysis.

The district court acknowledged that the third factor (amount and substantiality of the portion used) was not particularly relevant in this case. However, it concluded that the factor favored a fair use finding because copying the entirety of the books was reasonable given Meta's transformative purpose, as LLMs perform better when trained on complete, high-quality data.

The district court started its review of the fourth fair use factor (effect on the potential market for the copyrighted work) by acknowledging it to be the most important factor in the fair use analysis.
It explained that the relevant question is whether the defendant's use will function as a market substitute for the plaintiffs' works. The court rejected the plaintiffs' arguments that:

  • Meta's unauthorized use of the plaintiffs' books affects the market for licensing the works for the purpose of training its LLMs. The district court held that this is not a harm that the Copyright Act seeks to prevent. Otherwise, every copyright infringement plaintiff could argue that it has been deprived of the right to license the use at issue in the case.
  • Llama is capable of reproducing portions of their books, therefore harming the market for the plaintiffs' works. The evidence showed that even adversarial prompting designed to make Llama regurgitate the plaintiffs' works yielded only 50-word snippets from the books, which could not have a meaningful effect on the market for the plaintiffs' books.

Although the plaintiffs focused only on these two alleged harms, the district court analyzed a third form of harm: GenAI's ability to rapidly generate countless works that compete with and reduce demand for the plaintiffs' works, even if the AI-generated works are non-infringing. The court referred to this form of harm as market dilution (or indirect substitution), which it noted is still market substitution, could reduce the incentive for authors to create, and is the specific harm that copyright aims to prevent.

The court stated that market dilution harm is far greater (and therefore more relevant) in the case of GenAI than in other cases involving individual secondary works or digital tools, such as Google Books, because GenAI can quickly flood the market with millions of competing works. The court stated that it 'seems likely' that market dilution will often cause the fourth fair use factor to decisively favor plaintiffs in similar cases.
However, in this case, because Meta introduced evidence that its use of the plaintiffs' works did not cause market harm and the plaintiffs failed to demonstrate the contrary, the district court (seemingly reluctantly) found that the fourth factor weighed in favor of fair use.

In granting summary judgment to Meta on its fair use defense, the district court stated that its ruling should not indicate that Meta's use of copyrighted content to train its LLMs was lawful, but only that the plaintiffs did not show the market dilution that GenAI causes. The court further surmised that, in many circumstances, the unauthorized use of copyright-protected works to train GenAI models will be infringing and developers will therefore need to pay copyright owners for the right to use their materials for this purpose.

Arsenal 'close to Viktor Gyokeres transfer with Gunners in advanced talks with Sporting after ruling out Benjamin Sesko'
The Sun

33 minutes ago

ARSENAL are in advanced talks to sign Viktor Gyokeres after homing in on the Swede over their other target Benjamin Sesko, according to reports.

The Gunners have been weighing up their options for a new centre-forward and it would appear they are now focusing on Gyokeres.

According to Belgian journalist Sacha Tavolieri, Arsenal feel like they are "touching the final line" when it comes to agreeing a deal for the Sporting Lisbon star. It's also claimed they have already agreed a five-year contract with Gyokeres himself.

Gyokeres has found himself trying to force a move away from Sporting this summer after claims that a gentleman's agreement between him and the club, allowing him to leave for a cut-price fee, has not been honoured.

Sporting president Frederico Varandas has claimed the club are happy to let Gyokeres leave for less than his release clause, but will stand firm when it comes to getting a fair price for the Portuguese league's top scorer.

Citing the prices paid for some other players this summer, Varandas told O Jogo: "Sporting does not need to sell him, but we remain sensitive to the dreams of Viktor and any of our athletes.

"After weeks of meetings, we are not asking for the release clause and will be reasonable regarding the price we ask for Viktor. Today, I believe there is a strong probability he will leave.

"We have been watching the market and I saw [Martin] Zubimendi, who is six months younger than Viktor, leave for €65million.

"I saw Matheus Cunha and Bryan Mbeumo, both forwards but who, in my opinion, do not have Viktor's market value or quality, being negotiated for around €75million.

"Given the demands we consider fair, I believe Viktor could leave – unless he has the worst agent in the world, which is hard for me to believe, because he is one of the best footballers in the world."

He added: "I'm not going to say what the price is, the player knows what it is.

"I can tell you that Viktor won't leave for €60million plus €10million, he won't, he just won't."

Meanwhile Fabrizio Romano has claimed that the club's stance has left Gyokeres feeling "betrayed and tired".

It's been claimed that Gyokeres has informed Sporting that he won't be returning to the club and has no intention of playing for them again.

Gyokeres scored 39 goals in 33 league appearances last season, more than any other player in Europe's top 10 leagues.

Arsenal have also been linked with a move for Sesko, but advancements in the Gyokeres deal would likely spell an end to those talks.

The Gunners today announced the arrival of Kepa Arrizabalaga from Chelsea on a £5m deal, with the Spanish goalkeeper set to play backup to first-choice shot-stopper David Raya next season.

Deals for Brentford's Christian Norgaard and Real Sociedad's Martin Zubimendi are also thought to be on the verge of completion.

The British political class have shown themselves at their worst
Telegraph

33 minutes ago

The result should never have been in doubt. That whips and ministers were nervous at all should be testament enough as to how badly this government is being run. The welfare reform Bill was finally passed with a majority of 75, about 100 fewer than Labour's notional majority.

But there is something missing from ministers' and MPs' reactions to this 'victory': the cheers, such as they were, sounded forced. The smiles were wan. The congratulations looked half-hearted. Because this is a Bill whose passage means many losers and zero winners – a rare achievement in parliamentary politics.

Of course, the real losers are those future claimants of Personal Independence Payments (PIP) who, depending on the detail of the latest concessions granted by Keir Starmer, will find it much more difficult to have their claims approved.

But there are many more political losers. There are the rebels themselves, at least some of whom might have hoped for personal advancement in their political careers and who must now face years of being nominated for the crummiest, dullest standing committees – the traditional punishment for those who won't take their whips' advice.

Then there are the Conservatives, who voted against a measure many of them clearly supported. There was even a shadow cabinet meeting last week at which Kemi Badenoch asked each member how the party should vote. That such a question even needed to be asked suggests there was at least some support for a more principled, less cynical stance.

Then of course there is the Government, which, before this debate and vote, was in a slightly stronger, slightly more popular position than this evening and which has now achieved the passing of a measure that even ministers can no longer see the point of. It has spent a lot of its political credibility in securing a Bill that was originally sold as a genuinely reformist measure (it is not) and which would save the Treasury billions (it will not).

Not the Commons' finest moment. A damaged legislature, a damaged government and, most importantly, a damaged prime minister. Happy anniversary, Sir Keir.
