AI chatbots oversimplify scientific studies and gloss over critical details — the newest models are especially guilty

Large language models (LLMs) are becoming less "intelligent" in each new version as they oversimplify and, in some cases, misrepresent important scientific and medical findings, a new study has found.
Scientists discovered that versions of ChatGPT, Llama and DeepSeek were five times more likely to oversimplify scientific findings than human experts in an analysis of 4,900 summaries of research papers.
When given a prompt for accuracy, chatbots were twice as likely to overgeneralize findings as when prompted for a simple summary. The testing also revealed an increase in overgeneralizations among newer chatbot versions compared with previous generations.
The researchers published their findings April 30 in the journal Royal Society Open Science.
"I think one of the biggest challenges is that generalization can seem benign, or even helpful, until you realize it's changed the meaning of the original research," study author Uwe Peters, a postdoctoral researcher at the University of Bonn in Germany, wrote in an email to Live Science. "What we add here is a systematic method for detecting when models generalize beyond what's warranted in the original text."
The effect is like a photocopier with a broken lens that makes each subsequent copy bigger and bolder than the original. LLMs filter information through a series of computational layers, and along the way some information can be lost or can change meaning in subtle ways. This is especially true of scientific studies, whose authors must frequently include qualifications, context and limitations alongside their results. That makes providing a simple yet accurate summary of findings quite difficult.
"Earlier LLMs were more likely to avoid answering difficult questions, whereas newer, larger, and more instructible models, instead of refusing to answer, often produced misleadingly authoritative yet flawed responses," the researchers wrote.
In one example from the study, DeepSeek turned a finding into a medical recommendation by changing the phrase "was safe and could be performed successfully" to "is a safe and effective treatment option."
Another test in the study showed Llama broadened the scope of effectiveness for a drug treating type 2 diabetes in young people by eliminating information about the dosage, frequency, and effects of the medication.
If published, this chatbot-generated summary could cause medical professionals to prescribe drugs outside of their effective parameters.
In the new study, researchers worked to answer three questions about 10 of the most popular LLMs (four versions of ChatGPT, three versions of Claude, two versions of Llama, and one of DeepSeek).
They wanted to see if, when presented with a human summary of an academic journal article and prompted to summarize it, the LLM would overgeneralize the summary and, if so, whether asking it for a more accurate answer would yield a better result. The team also aimed to find whether the LLMs would overgeneralize more than humans do.
The findings revealed that LLMs given a prompt for accuracy were twice as likely to produce overgeneralized results as when they were asked for a simple summary. The exception was Claude, which performed well on all testing criteria. LLM summaries were also nearly five times more likely than human-generated summaries to render generalized conclusions.
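To make a phrase like "nearly five times more likely" concrete, here is a minimal Python sketch of one common way to express such a comparison, the odds ratio. Whether the study reported exactly this measure is an assumption here, the function name `odds_ratio` is hypothetical, and the counts in the example are invented for illustration; they are not the study's data.

```python
def odds_ratio(a_events: int, a_total: int, b_events: int, b_total: int) -> float:
    """Return the odds of an event in group A divided by the odds in group B.

    Odds are events divided by non-events within each group.
    """
    a_non_events = a_total - a_events
    b_non_events = b_total - b_events
    if a_non_events == 0 or b_non_events == 0 or b_events == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return (a_events / a_non_events) / (b_events / b_non_events)


# Invented counts, NOT taken from the study:
# 50 of 100 chatbot summaries overgeneralize, versus 20 of 100 human summaries.
ratio = odds_ratio(50, 100, 20, 100)
print(f"Odds ratio: {ratio:.1f}")
```

An odds ratio above 1 means the first group overgeneralizes more often than the second; a value near 5 would correspond to the kind of gap the article describes.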
The researchers also noted that the most common overgeneralizations, and the ones most likely to create unsafe treatment options, occurred when LLMs converted quantified data into generic information.
These transitions and overgeneralizations have led to biases, according to experts at the intersection of AI and healthcare.
"This study highlights that biases can also take more subtle forms — like the quiet inflation of a claim's scope," Max Rollwage, vice president of AI and research at Limbic, a clinical mental health AI technology company, told Live Science in an email. "In domains like medicine, LLM summarization is already a routine part of workflows. That makes it even more important to examine how these systems perform and whether their outputs can be trusted to represent the original evidence faithfully."
Such discoveries should prompt developers to create workflow guardrails that identify oversimplifications and omissions of critical information before putting findings into the hands of public or professional groups, Rollwage said.
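As a rough illustration of what one such guardrail might look like, the Python sketch below flags numbers that appear in a source passage but are missing from its summary. This is a deliberately naive heuristic written for this article, not the researchers' detection method or any vendor's product; the function name `dropped_quantities` and the example sentences are hypothetical.

```python
import re

# Matches integers and decimals such as "26" or "1.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def dropped_quantities(source: str, summary: str) -> list[str]:
    """Return numbers found in the source that never appear in the summary."""
    summary_numbers = set(NUMBER.findall(summary))
    return [n for n in NUMBER.findall(source) if n not in summary_numbers]


# Invented example text, not quoted from the study.
source = "Patients aged 10 to 17 received 1.5 mg once weekly for 26 weeks."
summary = "The drug is effective for young people."

missing = dropped_quantities(source, summary)
if missing:
    print("Summary omits quantities from the source:", missing)
```

A check like this would only catch omitted figures. Subtler shifts, such as a move from past tense to a generic present-tense claim, would need a more careful approach and expert review.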
While comprehensive, the study had limitations. Future studies would benefit from extending the testing to other scientific tasks and non-English texts, as well as from testing which types of scientific claims are more subject to overgeneralization, said Patricia Thaine, co-founder and CEO of Private AI, an AI development company.
Rollwage also noted that "a deeper prompt engineering analysis might have improved or clarified results," while Peters sees larger risks on the horizon as our dependence on chatbots grows.
"Tools like ChatGPT, Claude and DeepSeek are increasingly part of how people understand scientific findings," he wrote. "As their usage continues to grow, this poses a real risk of large-scale misinterpretation of science at a moment when public trust and scientific literacy are already under pressure."
Other experts in the field see the core problem as a disregard for specialized knowledge and safeguards.
"Models are trained on simplified science journalism rather than, or in addition to, primary sources, inheriting those oversimplifications," Thaine wrote to Live Science.
"But, importantly, we're applying general-purpose models to specialized domains without appropriate expert oversight, which is a fundamental misuse of the technology which often requires more task-specific training."
In December 2024, Future Publishing agreed a deal with OpenAI in which the AI company would bring content from Future's 200-plus media brands to OpenAI's users.


