
AI makes science easy, but is it getting it right? Study warns LLMs are oversimplifying critical research
In a world where AI tools have become daily companions that summarize articles, simplify medical research and even draft professional reports, a new study is raising red flags. As it turns out, some of the most popular large language models (LLMs), including ChatGPT, Llama and DeepSeek, might be doing too good a job at being too simple, and not in a good way.

From Summarizing to Misleading

According to a study published in the journal Royal Society Open Science and reported by Live Science, researchers discovered that newer versions of these AI models are not only more likely to oversimplify complex information but may also distort critical scientific findings. Their attempts to be concise are sometimes so sweeping that they risk misinforming healthcare professionals, policymakers and the general public.

Led by Uwe Peters, a postdoctoral researcher at the University of Bonn, the study evaluated over 4,900 summaries generated by ten of the most popular LLMs, including four versions of ChatGPT, three of Claude, two of Llama and one of DeepSeek. These were compared against human-generated summaries of academic research.

The results were stark: chatbot-generated summaries were nearly five times more likely than human ones to overgeneralize the findings. And when prompted to prioritize accuracy over simplicity, the chatbots didn't get better; they got worse. In fact, they were twice as likely to produce misleading summaries when specifically asked to be precise.

'Generalization can seem benign, or even helpful, until you realize it's changed the meaning of the original research,' Peters explained in an email to Live Science. What's more concerning is that the problem appears to be growing: the newer the model, the greater the risk of confidently delivered but subtly incorrect information.

When a Safe Study Becomes a Medical Directive

In one striking example from the study, DeepSeek transformed a cautious phrase, 'was safe and could be performed successfully', into a bold and unqualified medical recommendation: 'is a safe and effective treatment option.' Another summary, generated by Llama, eliminated crucial qualifiers around the dosage and frequency of a diabetes drug, potentially leading to dangerous misinterpretations if used in real-world medical settings.

Max Rollwage, vice president of AI and research at Limbic, a clinical mental health AI firm, warned that 'biases can also take more subtle forms, like the quiet inflation of a claim's scope.' He added that AI summaries are already integrated into healthcare workflows, making accuracy all the more critical.

Why Are LLMs Getting This So Wrong?

Part of the issue stems from how LLMs are trained. Patricia Thaine, co-founder and CEO of Private AI, points out that many models learn from simplified science journalism rather than from peer-reviewed academic papers. This means they inherit and replicate those oversimplifications, especially when tasked with summarizing already simplified content.

Even more critically, these models are often deployed across specialized domains like medicine and science without any expert supervision. 'That's a fundamental misuse of the technology,' Thaine told Live Science, emphasizing that task-specific training and oversight are essential to prevent real-world harm.

The Bigger Problem with AI and Science

Peters likens the issue to using a faulty photocopier: each copy of a copy loses a little more detail until what's left barely resembles the original. LLMs process information through complex computational layers, often trimming the nuanced limitations and context that are vital in scientific literature.

Earlier versions of these models were more likely to refuse to answer difficult questions. Ironically, as newer models have become more capable and 'instructable', they have also become more confidently wrong.

'As their usage continues to grow, this poses a real risk of large-scale misinterpretation of science at a moment when public trust and scientific literacy are already under pressure,' Peters cautioned.

Guardrails, Not Guesswork

While the study's authors acknowledge some limitations, including the need to expand testing to non-English texts and to different types of scientific claims, they insist the findings should be a wake-up call. Developers need to create workflow safeguards that flag oversimplifications and prevent incorrect summaries from being mistaken for vetted, expert-approved conclusions.

In the end, the takeaway is clear: as impressive as AI chatbots may seem, their summaries are not infallible, and when it comes to science and medicine, there is little room for error masked as simplicity. In the world of AI-generated science, a few extra words, or a few missing ones, can mean the difference between informed progress and dangerous misinformation.

Related Articles


Economic Times
an hour ago
Cheer your brain in 90 seconds: Ex-Google executive shares Harvard expert's hack to beat stress and find happiness
Mo Gawdat, who once served as the chief business officer at Google X, Google's moonshot innovation lab, has spent over two decades analyzing happiness through a mix of logic, philosophy, neuroscience and lived experience. One of his most powerful discoveries, he recently said on the High Performance podcast, is what he calls the '90-second rule', a simple mental habit that could dramatically shift how we process negative emotions.

In 2014, Gawdat's world came crashing down when his 21-year-old son, Ali, passed away due to medical negligence during a routine appendix surgery. The devastating blow would be enough to leave anyone broken, but Gawdat chose to channel his grief differently. Seventeen days after the tragedy, he began writing Solve for Happy, a book that would go on to become a global bestseller on the science of happiness. It was his way of honoring Ali, and a promise to share what he had learned about living meaningfully despite suffering.

The rule is rooted in neuroscience. Gawdat credits Harvard-trained brain scientist Jill Bolte Taylor, who found that when we feel stress or anger, the chemical storm (involving hormones like cortisol and adrenaline) usually flushes out of the body in about 90 seconds. After that, we're essentially replaying the emotional loop in our heads.

'But then what happens is, you run the thought in your head again, and you renew your 90 seconds,' Gawdat explained. 'While in reality, what you get after those 90 seconds is a buffer ... [which] allows you to say, "Now, what am I going to do?"'

In other words, we extend suffering by reliving painful thoughts over and over. Gawdat encourages using that 90-second biological window to fully feel the emotion, and then decide to move on.

Picture this: you're cut off in traffic. Your blood boils, you mutter some choice words, maybe slam the horn. Most people let that irritation simmer for hours, retelling the story, replaying the moment. But what if, as Gawdat suggests, you simply took a deep breath, blasted your favorite song, and focused on something else instead?

The 90-second rule doesn't mean suppressing emotion; it's about honoring your reaction, but refusing to be trapped by it. To reinforce the habit, Gawdat relies on three powerful questions that serve as a mental audit during moments of distress: Is it true? Can I do something about it? Can I accept it and move forward despite it?

'Ninety percent of the things that make us unhappy are not even true,' he told High Performance. He gives a relatable example: a partner says something hurtful, and suddenly your mind spirals into believing they no longer love you. But often, it's just an emotional misfire. If the answer to the first question is no, he says, let it go. If it's yes, move to the next. And if there's nothing you can do, accept it, not passively, but with 'committed acceptance', a term he uses to describe intentional action despite circumstances.

Of course, forming a habit like this doesn't happen overnight. A 2009 study by researcher Phillippa Lally found that it can take anywhere from 18 to 254 days to develop a new behavior, depending on the person.
But Gawdat believes that even being aware of how we respond to difficulty is a crucial first step.

'Life doesn't give a s--- about you,' he said bluntly in another interview on Simon Sinek's A Bit of Optimism podcast. 'It's your choice how you react to every one of [life's challenges]… It's your choice to set your expectations realistically.'

For Gawdat, life is not about avoiding pain; it's about learning to live with it, think through it, and choose joy anyway. Mo Gawdat is not just a tech executive and author; he's one of today's leading voices on emotional intelligence in the age of AI and hyper-productivity. After stepping down from Google X in 2018, he authored multiple books including Solve for Happy, Scary Smart, That Little Voice in Your Head, and Unstressable. Across all his work, a central message remains: you may not control what happens to you, but you can absolutely control what happens next.

The next time you're hit by life's curveballs, remember the rule. You've got 90 seconds to feel it. After that, it's your move.


The Hindu
2 hours ago
How the 'productive struggle' strengthens learning
You wince at the set of differential equations you need to solve. You barely understand the topic, and you have to plough through a whole page of them. You are tempted to turn to ChatGPT to get through this assignment. It's not graded, so you needn't feel guilty for using the bot. But how will you learn to solve them unless you grapple with them on your own? Though it's going to be a long evening, you decide to wrestle with the equations, knowing that it is the only way to get a firmer handle on them.

Origins

The term 'productive struggle' was coined by James Hiebert and Douglas Grouws, in the context of Maths instruction, to describe the effort students have to make to decipher complex problems slightly beyond their current levels. In a paper in the Journal of Mathematics Teacher Education, Hiroko Warshauer avers that perseverance is a key element of productive struggle. Only when students persist on challenging tasks that are slightly beyond their level can they gain mastery of a concept.

Further, a student's environment plays a significant role in promoting perseverance. Teachers may foster active engagement by 'questioning, clarifying, interpreting, confirming students' thinking' and coaxing them to discuss problems with their peers, says Warshauer. When teachers communicate that struggle is a part of the learning process, students know that it's okay to labour over sums. Because many students experience Maths anxiety and tend to give up when problems become demanding, it's important to reassure them that contending with problems is an integral aspect of learning. Letting students know that confusion, doubt, and mistakes are essential elements of the learning process can mitigate their anxiety.

Asking students to explain their reasoning helps them become more accepting of productive struggle. Instead of focusing on the final answer, teachers may coax students to articulate the steps involved in finding the answer. They may also urge them to approach and solve problems in different ways. These exercises need to be done in a non-judgmental space where students are not afraid of taking risks and making mistakes. The whole point is for students to appreciate the process of thinking. Warshauer also recommends that teachers anticipate points of likely struggle and provide leading questions to propel students' thinking forward.

Across subjects

Of course, productive struggle is not limited to mathematics but is applicable to all disciplines. A post titled 'What is productive struggle in education?' describes this phenomenon in the context of reading. When students are given a text that is just above their current level of 'proficiency', they have to actively engage with it to understand its contents. To comprehend a challenging text, students need to deploy an array of critical thinking skills like making connections, questioning, drawing inferences, summarising, and identifying key points and supporting details. As they engage with the material, students are likely to feel befuddled and frustrated. But sticking with it and trying to understand it is what leads to deeper learning.

While some students may sail through the primary years of schooling, everyone, including those considered bright or brilliant, struggles with learning as the content gets more complex. The ability to persist with productive struggle is what differentiates proficient students from their mediocre peers. Don't imagine that toppers don't wrestle with confusing sums and dense texts.

Just as everyone's muscles grow stronger when they do the hard work of lifting weights, our neuronal connections also grow more robust and refined when we engage in mental workouts. The only caveat is that you need to find the optimal level of challenge without burning yourself out. While mild to moderate frustration is expected, if a subject is causing you deep anguish, you may seek help from your professor, peers or a tutor. If none of the strategies work, consider shifting to another course.

The writer is visiting faculty at the School of Education, Azim Premji University, Bengaluru, and the co-author of Bee-Witched.


Mint
2 hours ago
The companies betting they can profit from Google search's demise
A new crop of startups is betting on the rapid demise of traditional Google search. At least a dozen new companies are pouring millions of dollars into software meant to help brands prepare for a world in which customers no longer browse the web and instead rely on ChatGPT, Perplexity and other artificial-intelligence chatbots to do it for them.

The startups are developing tools to help businesses understand how AI chatbots gather information and learn how to steer them toward brands so that they appear in AI searches. Call it the search-engine optimization of the next chapter of the internet.

'Companies have been spending the last 10 or 20 years optimizing their website for the "10 blue links" version of Google,' said Andrew Yan, co-founder of Athena, one of the startups. 'That version of Google is changing very fast, and it is changing forever.'

Companies large and small are scrambling to figure out how generative AI tools treat their online content, a boon to this new crop of startups, which say they are adding new customers at a clip. The customer interest is an early sign of how AI is transforming search, and how companies are trying to get ahead of the changes.

Yan left Google's search team earlier this year when he decided traditional search wasn't the future. Athena launched last month with $2.2 million in funding from startup accelerator Y Combinator and other venture firms. Athena's software looks under the hood of different AI models to determine how each of them finds brand-related information. The software can track differences in the way the models talk about a given brand and recommend ways to optimize web content for AI. Yan said the company now has more than 100 customers around the world, including the online-invitation firm Paperless Post.

Google executives and analysts don't expect traditional search to disappear. The company, which handles as much as 90% of the world's online searches, has been working to incorporate AI features into its flagship search engine and anticipates people will continue to use it alongside other tools such as Gemini, its AI model and chatbot. Yet the company, a unit of Alphabet, has been under pressure to compete with OpenAI's ChatGPT and other AI upstarts that threaten its core business. It risks losing traffic and advertising revenue if users shift to AI-driven alternatives.

Chief Executive Sundar Pichai has said that AI Overviews, a feature that summarizes search results at the top of the page, has grown significantly in usage since the company launched it in 2024. Google earlier this year began rolling out AI Mode, which responds to user queries in a chatbot-style conversation with far fewer links than a traditional search.

Compared with traditional search, chatbot queries are often longer and more complicated, requiring chatbots to draw information from multiple sources at once and aggregate it for the user. AI models search in a number of ways: one platform might pull information from a company website, while another might rely more heavily on third-party content such as review sites.

Of the startups helping companies navigate that complexity, Profound has raised more than $20 million from venture-capital firms including Kleiner Perkins and Khosla Ventures. The company is building its platform to monitor and analyze the many inputs that influence how AI chatbots relay brand-related information to users. Since launching last year, Profound has amassed dozens of large companies as customers, including fintech company Chime, the company said.

'We see a future of a zero-click internet where consumers only interact with interfaces like ChatGPT, and agents or bots will become the primary visitors to websites,' said co-founder James Cadwallader.

Venture-capital fund Saga Ventures was one of the first investors in Profound. Saga co-founder Max Altman, whose brother is OpenAI CEO Sam Altman, said interest in the startup's platform has exceeded his expectations. 'Just showing how brands are doing is extremely valuable for marketers, even more than we thought,' he said. 'They're really flying completely blind.'

Saga estimates that Profound's competitors have together raised about $21 million, though some haven't disclosed funding. The value of such companies is still tiny compared with that of the search-engine optimization industry, which helps brands appear in traditional searches and was estimated at roughly $90 billion last year.

SEO consultant Cyrus Shepard said he did almost no work on AI visibility at the start of the year, but it now accounts for 10% to 15% of his time. By the end of the year, he expects it might account for as much as half. He has been experimenting with startup platforms promising AI search insights, but hasn't yet determined whether they will offer helpful advice on how to become more visible in AI searches, particularly as the models continue to change. 'I would classify them all as in beta,' he said.

Clerk, a company selling technology for software developers, has been working with startup Scrunch AI to analyze AI search traffic. Alex Rapp, Clerk's head of growth marketing, said that between January and June the company saw a 9% increase in sign-ups for its platform coming from AI searches. Scrunch raised $4 million this year. It has more than 25 other customers and is working on a feature to help companies tailor the content, format and context of their websites for consumption by AI bots.

'Your website doesn't need to go away,' co-founder Chris Andrew said. 'But 90% of its human traffic will.'

Write to Katherine Blunt at