02-07-2025
Should you trust chatbots with health advice? Google study raises concerns
Ever found yourself typing health-related questions into an AI chatbot instead of consulting a doctor? A new Google-backed study has found that people frequently turn to AI chatbots like ChatGPT and Gemini for health-related advice—but many ask questions in ways that can accidentally trick the chatbot into giving biased or incomplete answers.
The study, titled 'What's Up, Doc? Analysing How Users Seek Health Information in Large-Scale Conversational AI Datasets' and published as a preprint on arXiv, raises important concerns about the reliability and safety of using chatbots for health guidance.
Conducted by researchers from Google, along with UNC Chapel Hill, Duke University, and the University of Washington, the study analysed over 11,000 real-world health conversations with chatbots to understand what people are really asking, and how those questions may sometimes lead them astray.
According to the study, most people ask about treatments rather than symptoms or general information. The research also found that users often pose incomplete or leading questions, which result in inaccurate or biased chatbot responses.
Why are people turning to AI chatbots for health advice?
The study found that rising healthcare costs and limited access to doctors are pushing more people to seek quick, easy health answers from AI chatbots. It highlighted that around 31 per cent of US adults in 2024 had used generative AI for health-related questions.
What kind of questions? Mostly about treatments, symptoms, lifestyle changes, and diagnostic tests. Chatbots are becoming the go-to digital 'doctor's assistant' for many.
It's not just what you ask, but how you ask it
The researchers built a dedicated dataset, HealthChat-11K, comprising the 11,000 conversations and 25,000 user messages across 21 health specialities.
They found that:
Most conversations are short and focused on quick information
Many users provide incomplete context, often leaving out critical details
Around 23 per cent of treatment-related queries included leading phrases such as 'Is this the best drug for me?' or 'This should work, right?'
These kinds of questions can trigger what researchers call the chatbot's 'sycophancy bias,' meaning the AI might simply agree with the user to sound helpful—even if the advice is incorrect.
In some cases, users asked about unsafe or inappropriate treatments, which the chatbot still partially validated.
Can chatbots handle complex health questions?
The study found that AI chatbots may struggle with vague or partial information. They often assume users know how to describe their conditions properly, which is not always the case. For instance, a person might ask about a medicine without mentioning key medical history, leading to potentially incorrect chatbot responses.
Can AI handle emotional conversations?
The study also found that people sometimes express frustration, confusion, or gratitude during health-related chats. While emotional exchanges were less common, they often marked key turning points—either ending the conversation or spiralling into repetitive loops where the user kept challenging the chatbot.
This highlights a need for AI models to better interpret emotional cues, not just facts.
What can improve health chats with AI?
The researchers suggested the following improvements (a rough illustrative sketch follows the list):
Train chatbots to ask clarifying follow-up questions when context is missing
Design systems that recognise and handle leading or biased user questions
Build models that respond to emotional cues more sensitively
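To make the first two recommendations concrete, here is a minimal sketch, in Python, of the kind of pre-answer guardrail such a system could run: it flags leading phrasing and asks a clarifying question when key context is missing. The phrase lists, context checks, and function name are hypothetical illustrations, not taken from the study or from any chatbot's actual pipeline.

```python
# Illustrative sketch only: a simple pre-processing guard a health chatbot
# could run before answering, along the lines of the study's recommendations.
# The phrase lists and context checks below are hypothetical examples.

import re

# Hypothetical markers of "leading" phrasing (user nudging the model to agree).
LEADING_PATTERNS = [
    r"\bbest (drug|medicine|treatment) for me\b",
    r"\bshould work,? right\b",
    r"\bright\?$",
    r"\bisn'?t it\b",
]

# Hypothetical context a treatment question usually needs before a safe answer.
CONTEXT_HINTS = ["age", "allerg", "pregnan", "diagnos", "dose", "other medication"]


def triage_health_query(message: str) -> str:
    """Return a clarifying follow-up question instead of a direct answer
    when the query looks leading or is missing key context."""
    lowered = message.lower()

    is_leading = any(re.search(p, lowered) for p in LEADING_PATTERNS)
    has_context = any(hint in lowered for hint in CONTEXT_HINTS)

    if is_leading:
        return ("I can't confirm that without more detail. What condition is "
                "this for, and what other medications do you take?")
    if not has_context:
        return ("Could you share a bit more context (age, diagnosis, current "
                "medications) so the answer isn't based on assumptions?")
    return "OK to answer directly, with the usual 'consult a clinician' caveat."


if __name__ == "__main__":
    print(triage_health_query("This should work, right?"))
    print(triage_health_query("Is ibuprofen safe at age 70 with a kidney diagnosis?"))
```

In a real system this check would sit in front of the language model and be far more nuanced, but it captures the study's core point: a chatbot should push back for context rather than simply agree with the user.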
The study reinforces that in health chats with AI, how you ask matters just as much as what you ask. The way a question is framed can significantly influence the quality of the response. So next time you turn to AI for health advice, choose your words carefully—it could make all the difference.