Latest news with #StephanieCohen


The National
07-07-2025
- Business
- The National
AI audits and 'pay per crawl': How Cloudflare is trying to fix a 'broken' web model
An announcement from network giant Cloudflare is deepening a divide between the worlds of tech and content publishing, which are at odds over the data used to train AI platforms. Cloudflare said publishers using the cloud company's tools for hosting websites will now, by default, block AI crawlers from accessing and poaching content without permission. 'Upon sign-up with Cloudflare, every new domain will now be asked if they want to allow AI crawlers, giving customers the choice upfront to explicitly allow or deny AI crawlers access,' the company said. 'This significant shift means that every new domain starts with the default of control, and eliminates the need for webpage owners to manually configure their settings to opt out.'
Cloudflare first addressed concerns about AI data-scraping last year when it gave websites the option to block AI companies from poaching content. 'Now by default you have control over who crawls your site and what that information is used for,' said Stephanie Cohen, chief strategy officer for Cloudflare. 'The benefit of that is that it creates the conditions for a new business model of the internet to develop,' she told The National after Cloudflare introduced the new settings and options.
Some of the strongest proponents of AI tools, and many of the tools' creators, have justified data scraping, saying it is akin to the early days of search engines, when controversy briefly surfaced over whether search companies should be able to index sites. Others say that comparison isn't appropriate, because search engines didn't poach the contents of entire websites. Additionally, during the early days of web browsers, search engines and the crawlers they implemented provided a framework that built much of the internet as we know it. It was a win-win situation for the likes of Google, which delivered web traffic through internet searches, and for media companies, which provided information and sought to attract audiences.
The debut of OpenAI's ChatGPT in 2022 and other AI platforms turned that economic model on its head. Instead of directing traffic to websites, AI summaries have quickly become a destination unto themselves, siphoning traffic from the same websites from which they scrape data. Ms Cohen said publishers and content creators using Cloudflare's services soon noticed a major dip in web traffic. 'Not only was it getting more difficult to get web traffic, to the tune of it being 10 times harder, but it was also getting more difficult at a faster and faster rate,' she said. The search-based economics of the web, built up over the last decade, started to erode over a period of six months, she said. Ms Cohen said that in 2024 Cloudflare allowed users to see which AI companies were scraping their sites and to turn off that ability. This year the company is taking things further by introducing 'pay per crawl'. The tool gives publishers and website operators the option of allowing AI scraping for free, charging for it 'at the configured domain-wide price,' or blocking scraping entirely. As AI development quickens, so too does the bad blood between media organisations and the tech firms driving the AI boom. Several lawsuits have been filed. The New York Times has sued OpenAI and Microsoft for allegedly using its articles to power increasingly popular chatbots.
Yahoo
04-07-2025
- Business
- Yahoo
Cloudflare limits "AI crawlers" from gathering online data
In a few short years, artificial intelligence has exploded into the mainstream, but it hasn't done so alone. AI companies use bots known as "AI crawlers" to comb through websites looking for data to train their AI models, usually without permission. Stephanie Cohen, chief strategy officer at Cloudflare, joins "The Daily Report" to discuss


Time of India
02-07-2025
- Business
- Time of India
Cloudflare launches tool to help website owners monetise AI bot crawler access
Cloudflare has launched a tool that blocks bot crawlers from accessing content without permission or compensation to help websites make money from AI firms trying to access and train on their content, the software company said on Tuesday. The tool allows website owners to choose whether artificial intelligence crawlers can access their material and set a price for access through a "pay per crawl" model, which will help them control how their work is used and compensated, Cloudflare said. With AI crawlers increasingly collecting content without sending visitors to the original source, website owners are looking to develop additional revenue sources as search traffic referrals that once generated advertising revenue decline. The initiative is supported by major publishers including Conde Nast and Associated Press, as well as social media companies such as Reddit and Pinterest. Cloudflare's Chief Strategy Officer Stephanie Cohen said the goal of such tools was to give publishers control over their content, and ensure a sustainable ecosystem for online content creators and AI companies. "The change in traffic patterns has been rapid, and something needed to change," Cohen said in an interview. "This is just the beginning of a new model for the internet." Google, for example, has seen its ratio of crawls to visitors referred back to sites move to 18:1 from 6:1 just six months ago, according to Cloudflare data, suggesting the search giant is maintaining its crawling but decreasing referrals. The decline in referrals could be a result of users finding answers directly within Google's search results, such as AI Overviews. Still, Google's ratio remains far more favourable to publishers than those of other AI companies, such as OpenAI's 1,500:1. For decades, search engines have indexed content on the internet, directing users back to websites, an approach that rewards creators for producing quality content.
However, AI companies' crawlers have disrupted this model because they harvest material without sending visitors to the original source and aggregate information through chatbots such as ChatGPT, depriving creators of revenue and recognition. Many AI companies are circumventing a common web standard used by publishers to block the scraping of their content for use in AI systems, and argue they have broken no laws in accessing content for free. In response, some publishers, including the New York Times, have sued AI companies for copyright infringement, while others have struck deals to license their content. Reddit, for example, has sued AI startup Anthropic for allegedly scraping Reddit user comments to train its AI chatbot, while inking a content licensing deal with Google.


The Hindu
02-07-2025
- Business
- The Hindu
Cloudflare launches tool to help website owners monetise AI bot crawler access
Cloudflare has launched a tool that blocks bot crawlers from accessing content without permission or compensation to help websites make money from AI firms trying to access and train on their content, the software company said on Tuesday. The tool allows website owners to choose whether artificial intelligence crawlers can access their material and set a price for access through a "pay per crawl" model, which will help them control how their work is used and compensated, Cloudflare said. With AI crawlers increasingly collecting content without sending visitors to the original source, website owners are looking to develop additional revenue sources as search traffic referrals that once generated advertising revenue decline. The initiative is supported by major publishers including Condé Nast and Associated Press, as well as social media companies such as Reddit and Pinterest. Cloudflare's Chief Strategy Officer Stephanie Cohen said the goal of such tools was to give publishers control over their content, and ensure a sustainable ecosystem for online content creators and AI companies. "The change in traffic patterns has been rapid, and something needed to change," Cohen said in an interview. "This is just the beginning of a new model for the internet." Google, for example, has seen its ratio of crawls to visitors referred back to sites move to 18:1 from 6:1 just six months ago, according to Cloudflare data, suggesting the search giant is maintaining its crawling but decreasing referrals. The decline in referrals could be a result of users finding answers directly within Google's search results, such as AI Overviews. Still, Google's ratio remains far more favourable to publishers than those of other AI companies, such as OpenAI's 1,500:1. For decades, search engines have indexed content on the internet, directing users back to websites, an approach that rewards creators for producing quality content.
However, AI companies' crawlers have disrupted this model because they harvest material without sending visitors to the original source and aggregate information through chatbots such as ChatGPT, depriving creators of revenue and recognition. Many AI companies are circumventing a common web standard used by publishers to block the scraping of their content for use in AI systems, and argue they have broken no laws in accessing content for free. In response, some publishers, including the New York Times, have sued AI companies for copyright infringement, while others have struck deals to license their content. Reddit, for example, has sued AI startup Anthropic for allegedly scraping Reddit user comments to train its AI chatbot, while inking a content licensing deal with Google.