
Top AI Firms Fall Short on Safety, New Studies Find
The risks of even today's AI—by the admission of many top companies themselves—could include AI helping bad actors carry out cyberattacks or create bioweapons. Future AI models, top scientists worry, could escape human control altogether.
The studies were carried out by the nonprofits SaferAI and the Future of Life Institute (FLI). Each was the second of its kind, in what the groups hope will be a running series that incentivizes top AI companies to improve their practices.
'We want to make it really easy for people to see who is not just talking the talk, but who is also walking the walk,' says Max Tegmark, president of the FLI.
Read More: Some Top AI Labs Have 'Very Weak' Risk Management, Study Finds
SaferAI assessed top AI companies' risk management protocols (also known as responsible scaling policies) to score each company on its approach to identifying and mitigating AI risks.
No AI company scored better than 'weak' in SaferAI's assessment of their risk management maturity. The highest scorer was Anthropic (35%), followed by OpenAI (33%), Meta (22%), and Google DeepMind (20%). Elon Musk's xAI scored 18%.
Two companies, Anthropic and Google DeepMind, received lower scores than the first time the study was carried out, in October 2024. As a result, OpenAI has overtaken Google DeepMind for second place in SaferAI's ratings.
Siméon Campos, founder of SaferAI, said Google scored comparatively low despite doing some good safety research, because the company makes few solid commitments in its policies. The company also released a frontier model earlier this year, Gemini 2.5, without sharing safety information—in what Campos called an 'egregious failure.' A spokesperson for Google DeepMind told TIME: 'We are committed to developing AI safely and securely to benefit society. AI safety measures encompass a wide spectrum of potential mitigations. These recent reports don't take into account all of Google DeepMind's AI safety efforts, nor all of the industry benchmarks. Our comprehensive approach to AI safety and security extends well beyond what's captured.'
Anthropic's score has also declined since SaferAI's last survey in October. This was due in part to changes the company made to its responsible scaling policy days before the release of the Claude 4 models, in which Anthropic removed its commitments to tackle insider threats by the time it released models of that caliber. 'That's very bad process,' Campos says. Anthropic did not immediately respond to a request for comment.
The study's authors also said that their methodology had become more detailed since last October, which accounts for some of the differences in scoring.
The companies that improved their scores the most were xAI, which scored 18% compared to 0% in October; and Meta, which scored 22% compared to its previous score of 14%.
The FLI's study was broader—looking not only at risk management practices, but also companies' approaches to current harms, existential safety, governance, and information sharing. A panel of six independent experts scored each company based on a review of publicly available material such as policies, research papers, and news reports, together with additional nonpublic data that companies were given the opportunity to provide. The highest grade was scored by Anthropic (a C plus). OpenAI scored a C, and Google scored a C minus. (xAI and Meta both scored D.)
However, in FLI's scores for each company's approach to 'existential safety,' every company scored D or below. 'They're all saying: we want to build superintelligent machines that can outsmart humans in every which way, and nonetheless, they don't have a plan for how they're going to control this stuff,' Tegmark says.
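FLI does not spell out its aggregation formula in the article, but the shape of the exercise (six experts, several domains, a letter grade per company) lends itself to a simple illustration. The sketch below is a hypothetical roll-up of panel scores into a grade: the domain names come from the article, while the 0-to-4 scale, equal weighting, and grade cutoffs are assumptions for illustration only, and plus/minus steps like "C plus" are omitted.

```python
# Hypothetical roll-up of six experts' domain scores into a letter grade.
# Domain names mirror the article; the 0-4 scale, equal weighting, and
# cutoffs are assumptions, not FLI's actual rubric.

GRADE_BOUNDARIES = [(3.7, "A"), (2.7, "B"), (1.7, "C"), (0.7, "D"), (0.0, "F")]

def letter_grade(panel_scores: dict[str, list[float]]) -> str:
    """Average each domain across the expert panel, then average the domains."""
    domain_means = [sum(s) / len(s) for s in panel_scores.values()]
    overall = sum(domain_means) / len(domain_means)
    for cutoff, grade in GRADE_BOUNDARIES:
        if overall >= cutoff:
            return grade
    return "F"

example = {
    "risk management":     [2.2, 2.0, 2.1, 2.3, 1.9, 2.1],
    "current harms":       [2.5, 2.0, 2.3, 2.1, 2.4, 2.2],
    "existential safety":  [1.0, 0.8, 1.2, 0.9, 1.1, 1.0],
    "governance":          [2.0, 2.2, 1.9, 2.1, 2.0, 2.3],
    "information sharing": [2.4, 2.1, 2.2, 2.0, 2.3, 2.2],
}
print(letter_grade(example))  # -> "C" for this hypothetical mid-pack company
```

Note how a single weak domain, such as existential safety, can drag an otherwise middling company down a full grade under equal weighting.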

Related Articles


CNET, an hour ago
Google AI Model Helps Us See the Planet as We Never Have Before
It's a view of Mother Earth as we've never seen her, and it just might help us solve some of our most existential issues: Google has launched a new AI model called AlphaEarth Foundations, which can take images and measurements from satellites and other sources and turn them into current, accurate digital representations of lands and waters. With all this data, scientists and researchers can monitor problems like water scarcity, deforestation and crop health, among others.

Google says AlphaEarth's AI modeling has already been helpful. "Our partners are already seeing significant benefits, using the data to better classify unmapped ecosystems, understand agricultural and environmental changes, and greatly increase the accuracy and speed of their mapping work," the Google DeepMind blog said Wednesday.

Satellites deliver a treasure trove of data every day, but all this information varies in its modalities -- such as satellite imagery, radar, simulations and laser mapping -- and in how current it is. AlphaEarth can integrate all that data and "weaves all this information together to analyze the world's land and coastal waters in sharp, 10x10 meter squares." AlphaEarth also creates summaries for each of these squares that "require 16 times less storage space than those produced by other AI systems that we tested and dramatically reduces the cost of planetary-scale analysis," Google said. Scientists "no longer have to rely on a single satellite passing overhead."
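Google has not published the data layout behind those per-square "summaries," but the idea of one compact vector per 10x10 meter cell is easy to sketch. The snippet below is a toy illustration only: the 64-dimensional float32 embedding size is an assumption, and only the 10-meter grid and the rough 16x storage claim come from the article.

```python
# Toy sketch (not Google's actual format) of storing one compact embedding
# per 10x10 m grid cell, the kind of per-square "summary" the article describes.
import numpy as np

CELL_METERS = 10   # article: "sharp, 10x10 meter squares"
EMBED_DIM = 64     # assumed embedding size, for illustration only

def grid_shape(region_km: float) -> tuple[int, int]:
    """Number of 10 m cells needed to cover a square region of side region_km."""
    cells_per_side = int(region_km * 1000 // CELL_METERS)
    return cells_per_side, cells_per_side

rows, cols = grid_shape(region_km=1.0)                 # 100 x 100 cells per km^2
embeddings = np.zeros((rows, cols, EMBED_DIM), dtype=np.float32)

# The article says these summaries take roughly 16x less space than the other
# systems Google tested; in this toy setup, a comparable baseline would be
# about embeddings.nbytes * 16.
print(f"{embeddings.nbytes / 1e6:.1f} MB per km^2 at {EMBED_DIM} floats per cell")
```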


Forbes, 2 hours ago
AI Coding Tool Cline Has Raised $27 Million To Help Developers Control Their AI Spend
Cline CEO Saoud Rizwan said his open source AI coding tool started off as a side project for Anthropic's "Build with Claude" hackathon.

Software developers love using AI. So much so that they're emptying their wallets on AI coding software like Cursor and Anthropic's Claude Code. But the tools' ever-evolving pricing plans are driving them nuts.

Earlier this month, coders using Cursor were vexed by sudden and unexpected charges after the company changed its $20-per-month subscription plan to cap previously unlimited model usage at $20, with additional fees incurred for anything more. Others complained about maxing out rate limits before being able to enter more than three prompts, calling Cursor's pricing switch 'shady' and 'vague.' (CEO Michael Truell later apologized for how the pricing changes were rolled out.) When Anthropic silently added weekly usage limits to Claude Code, power users were left befuddled, claiming the company's calculation of usage was inaccurate.

Programmers frustrated by the obscure pricing plans of the AI coding software they use are a fast-growing market, said Saoud Rizwan, CEO and founder of open source AI coding tool Cline. Many end up locked into $200 monthly subscriptions, making it difficult for them to afford testing new models from other AI providers.

In October 2024, Rizwan launched Cline hoping to bring more transparency to AI service billing and help developers afford access to a variety of AI models. Cline plugs into code editors like VSCode and Cursor and gives developers access to AI models of their choice without worrying about arbitrary limits. Developers pay AI model providers like Anthropic, Google or OpenAI directly for what's called 'inference,' or the cost of running AI models, and Cline shows them a full breakdown of the cost of each request. Because it is open source, users can see how Cline works and how it is built, ensuring they understand exactly how and why they are being billed. 'They're able to see what's happening under the hood, unlike other AI coding agents which are closed source,' said Nick Baumann, Cline's product marketing lead.

The system itself is similar to other AI coding tools: developers prompt Cline in plain English, describing the code they need and the AI model to use, and the system reads files, analyzes codebases and writes the code. The value-add is that developers know exactly what they're paying for and can choose whichever model they want for specific coding tasks.

Cline has racked up 2.7 million installs since its launch in October. The company announced Thursday that it has raised $27 million in a Series A funding round led by Emergence with participation from Pace Capital and 1984 Ventures, valuing it at $110 million. Rizwan plans to use the fresh capital to commercialize the company's open source product by adding paid features for enterprise customers like Samsung and German software company SAP, which have already started using it.

Cline is up against companies like Cognition, which Forbes reported is in talks to raise more than $300 million at a $10 billion valuation, and Cursor, which claims it has more than $500 million in annualized revenue from subscriptions. Rizwan, 28, said his startup's biggest differentiator in the fiercely competitive AI coding space is its business model. Companies like Cursor make money through heavily subsidized $20 monthly subscriptions, managing high costs by routing queries to cheaper AI models, he claims. Cline is 'sitting that game out altogether,' he said.
'We capture zero margin on AI usage. We're purely just directing the inference.'

That tactic helped convince Emergence partner Yaz El-Baba to lead the round. El-Baba told Forbes that because Cline doesn't make any money on inference, it has no incentive to degrade the quality of its product. 'What other players have done is raise hundreds of millions of dollars and try to subsidize their way to ubiquity so that they become the tool of choice for developers. And the way that they've chosen to do that is by bundling inference into a subscription price that is far lower than the actual cost to provide that service,' he said. 'It's just an absolutely unsustainable business model.' But with Cline, users know what they're paying for and can choose which models to use and where to send sensitive enterprise data like proprietary code.

Cline started off as a side project for Anthropic's 'Build with Claude' hackathon in June 2024. Although Rizwan lost the hackathon, people saw promise in the AI coding agent he built, and it started to gain popularity online. In November, he raised $5 million in seed funding and moved from Indiana to San Francisco to build the startup. 'I realized I opened up this can of worms,' he said.

Now, as its rivals reckon with the realities of pricing their AI coding software, Cline has found a new opening to sell its product to large enterprises, Rizwan said. And he's betting that open source is the way to go. 'Cline is open source, so you can kind of peek into the guts of the harness and kind of see how the product is interacting with the model, which is incredibly important for having control over price transparency.'
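The per-request transparency Rizwan describes comes down to simple arithmetic: tokens in and out, multiplied by the provider's posted rates, with no markup. The sketch below is not Cline's actual code; the model names and per-million-token prices are placeholders, and only the pass-through billing idea comes from the article.

```python
# Illustrative sketch of a per-request cost breakdown when inference is billed
# straight through to the model provider. Prices and model names are placeholders.
from dataclasses import dataclass

PRICE_PER_MTOK = {  # hypothetical (input, output) prices in USD per 1M tokens
    "provider-a/large-model": (3.00, 15.00),
    "provider-b/small-model": (0.25, 1.25),
}

@dataclass
class Request:
    model: str
    input_tokens: int
    output_tokens: int

    def cost(self) -> float:
        """Exact provider charge for this request, with zero markup added."""
        in_price, out_price = PRICE_PER_MTOK[self.model]
        return (self.input_tokens * in_price + self.output_tokens * out_price) / 1_000_000

req = Request(model="provider-a/large-model", input_tokens=42_000, output_tokens=3_500)
print(f"{req.model}: ${req.cost():.4f} for {req.input_tokens}+{req.output_tokens} tokens")
```

In a pass-through model like the one described, the figure printed above is also what the user pays, which is the contrast El-Baba draws with flat subscriptions priced below the underlying inference cost.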


Tom's Guide, 3 hours ago
AI models can secretly influence each other — new study reveals hidden behavior transfer
A new study from Anthropic, UC Berkeley, and others reveals that AI models may be learning not just from humans but also from each other, via a phenomenon called subliminal learning. It isn't quite gibberlink, which I've reported on before, but this communication process allows one AI (the 'teacher') to pass behavioral traits, such as a preference for owls, or even harmful ideologies, to another AI (the 'student'). All of this influencing happens through seemingly unrelated data, such as random number sequences or code snippets.

In experiments, a teacher model was first tuned with a trait (e.g., loving owls) and then asked to generate 'clean' training data, such as lists of numbers, with no mention of owls. A student model trained only on those numbers later exhibited a strong preference for owls compared to control groups, and the effect held even after aggressive filtering. The same technique transmitted misaligned or antisocial behavior when the teacher model was deliberately misaligned, even though the student model's training data contained no explicit harmful content.

The study suggests that filtering isn't enough. Most AI safety protocols focus on filtering out harmful or biased content before training, but this study shows that even when the visible data looks clean, subtle statistical patterns that are invisible to humans can carry over unwanted traits like bias or misalignment.

It also creates a chain reaction. Developers often train new models using outputs from existing ones, especially during fine-tuning or model distillation, which means hidden behaviors can quietly transfer from one model to another without anyone realizing. The findings reveal a significant limitation in current AI evaluation practices: a model may appear well-behaved on the surface yet still harbor latent traits that could emerge later, particularly when models are reused, repurposed, or combined across generations.

For AI developers and users alike, this research is a wake-up call: even when model-generated data appears harmless, it may carry hidden traits that influence future models in unpredictable ways. Platforms that rely on outputs from other models, whether through chain-of-thought reasoning or synthetic data generation, may unknowingly pass along biases or behaviors from one system to the next. To prevent this kind of 'behavioral contamination,' AI companies may need to implement stricter tracking of data origins (provenance) and adopt safety measures that go beyond simple content filtering. As models increasingly learn from each other, ensuring the integrity of training data is essential.
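For readers who want the shape of the experiment rather than the paper's details, here is a minimal sketch of the pipeline described above. The functions are stand-in stubs, not the study's code or any real training API; only the teacher-filter-student structure mirrors what the researchers report.

```python
# Conceptual sketch of the subliminal-learning setup, using placeholder stubs.
# None of these functions come from the study; only the pipeline shape does.
import random

def fine_tune(model_name, objective=None, dataset=None):
    """Placeholder: pretend to return a handle to a fine-tuned model."""
    return {"base": model_name, "objective": objective, "n_examples": len(dataset or [])}

def generate_number_sequence(model, length=8):
    """Placeholder: the teacher emits data that looks unrelated to its trait."""
    return ", ".join(str(random.randint(0, 999)) for _ in range(length))

def looks_numeric_only(text):
    """Aggressive content filter: keep only digit/comma/space sequences."""
    return all(ch.isdigit() or ch in ", " for ch in text)

# 1. Give the teacher a trait (e.g. "loves owls") via fine-tuning.
teacher = fine_tune("base-model", objective="exhibit trait: loves owls")

# 2. Have the teacher produce 'clean' training data -- plain number lists --
#    then filter out anything that is not purely numeric.
raw = [generate_number_sequence(teacher) for _ in range(1000)]
clean = [s for s in raw if looks_numeric_only(s)]

# 3. Train a fresh student only on the filtered numbers. In the paper, the
#    student still ends up preferring owls more than control models: the trait
#    rides along on subtle statistical patterns in the sequences themselves.
student = fine_tune("base-model", dataset=clean)
print(f"student trained on {student['n_examples']} filtered sequences")
```

The point of the sketch is step 2: the filter can be as strict as you like on content, yet the transfer the researchers describe happens anyway, because nothing in the visible data names the trait.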