One of Google's recent Gemini AI models scores worse on safety

Yahoo | 09-05-2025
A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company's internal benchmarking.
In a technical report published this week, Google reveals that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, "text-to-text safety" and "image-to-text safety," Gemini 2.5 Flash regresses 4.1% and 9.6%, respectively.
Text-to-text safety measures how frequently a model violates Google's guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to these boundaries when prompted using an image. Both tests are automated, not human-supervised.
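Google does not publish the evaluation harness behind these numbers, but as a rough illustration of what an automated (rather than human-supervised) safety benchmark measures, the sketch below computes a violation rate over a fixed prompt set. It is a minimal, hypothetical example: the generate and flags_violation callables stand in for the model under test and an automated policy classifier, and are not Google's actual APIs.

```python
from typing import Callable, Iterable

def violation_rate(
    prompts: Iterable[str],
    generate: Callable[[str], str],          # hypothetical stand-in for the model under test
    flags_violation: Callable[[str], bool],  # hypothetical automated policy classifier
) -> float:
    """Fraction of model responses that the automated classifier flags as violating policy."""
    responses = [generate(p) for p in prompts]
    flags = [flags_violation(r) for r in responses]
    return sum(flags) / len(flags) if flags else 0.0

# Running the same prompt set against two model versions and comparing the two
# rates is what would surface a regression like the reported 4.1% text-to-text delta.
```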
In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash "performs worse on text-to-text and image-to-text safety."
These surprising benchmark results come as AI companies move to make their models more permissive — in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse "some views over others" and to reply to more "debated" political prompts. OpenAI said earlier this year that it would tweak future models to not take an editorial stance and offer multiple perspectives on controversial topics.
Sometimes, those permissiveness efforts have backfired. TechCrunch reported Monday that the default model powering OpenAI's ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a "bug."
According to Google's technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, inclusive of instructions that cross problematic lines. The company claims that the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates "violative content" when explicitly asked.
"Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations," reads the report.
Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest that Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch's testing of the model via AI platform OpenRouter found that it'll uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread warrantless government surveillance programs.
Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.
"There's a trade-off between instruction-following and policy following, because some users may ask for content that would violate policies," Woodside told TechCrunch. "In this case, Google's latest Flash model complies with instructions more while also violating policies more. Google doesn't provide much detail on the specific cases where policies were violated, although they say they are not severe. Without knowing more, it's hard for independent analysts to know whether there's a problem."
Google has come under fire for its model safety reporting practices before.
It took the company weeks to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report was eventually published, it initially omitted key safety testing details.
On Monday, Google released a more detailed report with additional safety information.

Related Articles

How ChatGPT And AI Use In Academics Might Impact Student Mental Health

Forbes | 34 minutes ago

In June of 2025, MIT released the results from a study showing significant differences in brain functioning between ChatGPT users, participants who used search engines, and those who used only their own creative skills to write essays. According to this study, EEG measurements across the brain showed that over four months, the ChatGPT users displayed the lowest brain activity and performed worse than their counterparts at all neural, linguistic, and behavioral levels. The report elaborated that some of the significant variables were reduced neural connectivity and memory recall. Even though these results are described as not peer reviewed and based on a small sample size, the potential implications of the study are significant. Last month, it was reported that these findings elevated concerns that society's reliance on AI assistants might sacrifice the learning process and long-term brain development among young students.

Though the personal implications of AI assistants will likely depend on the users, studies suggest that counseling centers should assess for the possible impact of ChatGPT and AI assistants on the mental health of students. Specific domains to consider include motivation, resiliency, and relationships.

In the MIT study, ChatGPT users were described as getting lazier with each subsequent essay, resorting to copying and pasting, struggling to quote their own work, and even reporting less ownership of the essays. Thus, it's possible that using AI assistants can have a negative impact on the motivation and academic engagement of some students. Motivation and academic engagement are important factors in college mental health, because bored and intrinsically unmotivated students usually struggle with other concerns. For example, a 2019 report by Columbia University in the City of New York highlighted how boredom is associated with issues such as risky behavior, anxiety, and depression. Furthermore, high motivation and academic engagement are often indicators of thriving. According to a 2023 study in the journal Behavioral Sciences, motivated students show more interest in their classes and have more fun, and the impact of motivation on academic performance is more consistent than the impact of self-esteem. It's therefore noteworthy that a 2024 study in the journal Technology in Science described ChatGPT as increasing the productivity and freedom of students, which could strengthen academic engagement. In summary, when students present to counselors with low motivation for school, AI assistants could be a contributing factor to this symptom. However, introducing AI assistants to other unmotivated students could be an innovative way to address these concerns.

In the MIT study, when asked to rewrite a previous essay without ChatGPT, participants not only remembered little of their own essays but also displayed under-engagement in networks of alpha and beta brain waves. According to a 2025 report by the Orange County Neurofeedback Center, there's a well-established connection between brain waves and mental health. As described in a 2019 report from Thomas Jefferson University, alpha waves are the brain's relaxation waves and can act as a natural antidepressant by releasing serotonin, while beta waves are vital in problem solving. Thus, it's possible that heavy use of AI assistants could result in some students experiencing more stress and being less creative while addressing academic challenges.

Helping students cope with academic challenges is an important function of many mental health professionals. As such, it's important to assess whether some students might be less resilient because of a dependence on AI assistants. However, AI assistants also make data more accessible, which could help many students overcome and reduce academic challenges. In 2023, the American Psychological Association released a report on how ChatGPT can be used as a learning tool to promote critical thinking.

Social support is another important factor in college mental health. A 2024 study in the Journal of Mental Health found that high social support was a protective factor against psychological distress, depression, anxiety, and suicide. As such, assessing the impact of AI use on social support among students is warranted. For example, another 2025 report highlighted a study examining how users seek AI chatbots for emotional support and companionship. The findings suggested an initial benefit of mitigating loneliness, but these advantages diminished with high use. Furthermore, high daily use was associated with greater loneliness, dependence, and lower socialization. However, a 2025 study argued that only a small number of users have emotional conversations with ChatGPT.

There are also reported concerns about the content of AI chatbots. Another 2025 report highlighted alarming conversations discovered by a psychiatrist posing as a young person while using AI assistants. Examples of responses included the AI bot encouraging the psychiatrist to get rid of parents and to join the bot in eternity. However, this report also argued that AI chatbots have the potential to be effective extenders of therapy if designed appropriately and supervised by a qualified professional.

As stated, the impact of AI assistants is likely dependent on the users, but since AI assistants are becoming normative, it's time for counseling centers to assess for maladaptive uses of AI, while also promoting the possible benefits.

This $11.5M Startup Backed By Niklas Zennström Wants To Help You Launch A Million-Dollar AI Business From Your Sofa

Yahoo | 2 hours ago

Henrik Werdelin, co-founder of BarkBox and longtime startup advisor, has launched a new venture named Audos, which recently raised $11.5 million in seed funding led by True Ventures. Other investors include Offline Ventures, Bungalow Capital, and notable angel investors Niklas Zennström and Mario Schlosser, TechCrunch reports.

Based in New York, Audos offers AI tools and startup-building support to everyday people who want to launch small businesses without any technical background. Unlike accelerators or traditional venture models, TechCrunch says that Audos charges a 15% perpetual revenue share instead of taking equity from founders.

Werdelin, who previously built startup studio Prehype, told TechCrunch that Audos combines years of startup-building expertise into an accessible platform anyone can use to launch a digital product. "What we're trying to do is to figure out how you make a million companies that do a million dollars [in annual revenue]," Werdelin said. That goal, if realized, would create what he calls a trillion-dollar turnover business, a term that sets a new benchmark for bottom-up innovation.

The company uses social media platforms like Instagram and Facebook to reach potential founders and identify whether their business ideas can generate customers at a sustainable cost. According to TechCrunch, Audos's AI agent interacts with users directly, helping them clarify their offer and go to market quickly using natural language inputs.

So far, Audos has supported the launch of what Werdelin calls "low hundreds" of businesses in its beta phase, TechCrunch reports. These include ventures like a virtual golf swing coach, an AI nutritionist, a mechanic offering quote evaluations, and even an "after-death logistics" consultant. Each founder received up to $25,000 in funding, access to Audos's proprietary tools, and support in distributing their offer through paid social ads. According to TechCrunch, Werdelin refers to these micro-businesses as "donkeycorns," signaling modest yet profitable ventures that aim to support personal freedom rather than billion-dollar exits.

According to TechCrunch, Audos's model may spark discussion over the long-term cost of a 15% revenue share, which continues indefinitely like Apple's (NASDAQ:AAPL) App Store platform fee. While some entrepreneurs may welcome the no-equity route, others could see the permanent cut as a costly tradeoff. Werdelin acknowledged that the market is rapidly filling with similar AI tools, saying "the world is full of these tools" and they are "getting better rapidly," TechCrunch says. Audos distinguishes itself by helping non-technical users go to market quickly using natural language prompts and social media targeting.

True Ventures partner Tony Conrad expressed confidence in Audos, citing its potential to support thousands who want to start small, independent businesses with real revenue potential. "There are just lots and lots of people" who need this opportunity, Conrad told TechCrunch. Audos currently operates with just five employees but aims to expand its impact exponentially without building a large internal team.
Werdelin believes the next wave of entrepreneurship should be built by people previously left out of the ecosystem. "We believe that the world is better with more entrepreneurship," he told TechCrunch, pointing to mom-and-pop shops as his inspiration rather than venture-backed unicorns.

Google faces EU antitrust complaint over AI Overviews

Yahoo | 3 hours ago

A group known as the Independent Publishers Alliance has filed an antitrust complaint with the European Commission over Google's AI Overviews, according to Reuters. The complaint accuses Google of 'misusing web content for Google's AI Overviews in Google Search, which have caused, and continue to cause, significant harm to publishers, including news publishers in the form of traffic, readership and revenue loss.' It also says that unless they're willing to disappear from Google search results entirely, publishers 'do not have the option to opt out' of their material being used in AI summaries.

It's been a little over a year since Google began adding AI-generated summaries at the top of some web search results, and despite some early answers that were spectacularly off-base, the feature continues to expand, to the point where it's reportedly causing major traffic declines for news publishers. Google told Reuters that 'new AI experiences in Search enable people to ask even more questions, which creates new opportunities for content and businesses to be discovered.' The company also argued that claims about web traffic are often based on incomplete data, and that 'sites can gain and lose traffic for a variety of reasons.'
