OpenAI's latest AI models report high 'hallucination' rate: What does it mean — and why is this significant?

Indian Express | 15-05-2025
A technical report released by artificial intelligence (AI) research organisation OpenAI last month found that the company's latest models — o3 and o4-mini — generate more errors than its older models. Computer scientists call the errors made by chatbots 'hallucinations'.
The report revealed that o3 — OpenAI's most powerful system — hallucinated 33% of the time on its PersonQA benchmark test, which involves answering questions about public figures. The o4-mini model hallucinated 48% of the time.
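Benchmark hallucination rates like these are, in essence, the share of a model's answers that graders judge to be fabricated. The minimal sketch below illustrates that arithmetic with entirely hypothetical data; it is not OpenAI's actual PersonQA setup, and the questions and labels are invented for illustration.

```python
# Illustrative only: how a hallucination rate on a QA benchmark could be tallied.
# The graded answers below are hypothetical, not OpenAI's PersonQA data.
graded_answers = [
    {"question": "Where was Ada Lovelace born?", "label": "correct"},
    {"question": "Which prizes did Marie Curie win?", "label": "hallucinated"},
    {"question": "When did Alan Turing publish his 1936 paper?", "label": "correct"},
]

hallucinated = sum(1 for a in graded_answers if a["label"] == "hallucinated")
rate = hallucinated / len(graded_answers)
print(f"Hallucination rate: {rate:.0%}")  # 1 of 3 answers -> 33%
```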
To make matters worse, OpenAI said it does not even know why these models are hallucinating more than their predecessors.
Here is a look at what AI hallucinations are, why they happen, and why the new report about OpenAI's models is significant.
When the term 'AI hallucination' was first used to describe errors made by chatbots, it had a very narrow definition: it referred to instances when AI models gave fabricated information as output. For instance, in June 2023, a lawyer in the United States admitted to using ChatGPT to help write a court filing after it emerged that the chatbot had added fake citations to the submission, pointing to cases that never existed.
Today, hallucination has become a blanket term for various types of mistakes made by chatbots. This includes instances when the output is factually correct but not actually relevant to the question that was asked.
ChatGPT, o3, o4-mini, Gemini, Perplexity, Grok and many more are all examples of what are known as large language models (LLMs). These models essentially take in text inputs and generate synthesised outputs in the form of text.
LLMs are able to do this because they are built using massive amounts of digital text taken from the Internet. Simply put, computer scientists feed these models a lot of text, which helps them identify patterns and relationships within that text so that they can predict text sequences and produce an output in response to a user's input (known as a prompt).
Note that LLMs are always making a guess when giving an output. They do not know for sure what is true and what is not — these models cannot even fact-check their output against, say, Wikipedia the way humans can.
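To make the 'always guessing' point concrete, here is a minimal, hypothetical sketch of next-word generation: the model only assigns probabilities to possible next words and samples one, with no step that checks whether the result is true. The vocabulary and probabilities below are invented for illustration and are not taken from any real model.

```python
import random

# Toy stand-in for an LLM: a hand-written table of next-word probabilities.
next_word_probs = {
    "The capital of France is": {"Paris": 0.90, "Lyon": 0.07, "Marseille": 0.03},
}

def generate_next_word(prompt: str) -> str:
    """Sample the next word from the model's probability distribution.

    There is no fact-checking step: a low-probability (wrong) word can
    still be chosen, which is one way fabricated output can arise.
    """
    probs = next_word_probs[prompt]
    return random.choices(list(probs.keys()), weights=list(probs.values()), k=1)[0]

print(generate_next_word("The capital of France is"))  # usually "Paris", but not always
```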
LLMs 'know what words are and they know which words predict which other words in the context of words. They know what kinds of words cluster together in what order. And that's pretty much it. They don't operate like you and me,' scientist Gary Marcus wrote on his Substack, Marcus on AI.
As a result, when an LLM is trained on inaccurate text, for example, it gives inaccurate outputs, thereby hallucinating.
However, even training on accurate text cannot stop LLMs from making mistakes. That is because, to generate new text in response to a prompt, these models combine billions of patterns in unexpected ways. So there is always a possibility that LLMs give fabricated information as output.
And as LLMs are trained on vast amounts of data, experts do not understand why they generate a particular sequence of text at a given moment.
Hallucination has been an issue with AI models from the start. In the initial years, big AI companies and labs repeatedly claimed that the problem would be resolved in the near future, and that did seem possible: after models were first launched, they tended to hallucinate less with each update.
However, after the release of the new report about OpenAI's latest models, it has become increasingly clear that hallucination is here to stay. The issue is also not limited to OpenAI: other reports have shown that Chinese startup DeepSeek's R-1 model has recorded a double-digit rise in hallucination rates compared with the company's previous models.
This means that the application of AI models has to be limited, at least for now. They cannot reliably be used, for example, as a research assistant (because models create fake citations in research papers) or as a paralegal-bot (because models cite imaginary legal cases).
Computer scientists like Arvind Narayanan, a professor at Princeton University, think that hallucination is, to some extent, intrinsic to the way LLMs work, and that as these models become more capable, people will use them for tougher tasks where the failure rate will be high.
In a 2024 interview, he told Time magazine, 'There is always going to be a boundary between what people want to use them [LLMs] for, and what they can work reliably at… That is as much a sociological problem as it is a technical problem. And I do not think it has a clean technical solution.'