Apple researchers say models like ChatGPT o3 look smart but collapse when faced with real complexity

India Today, 09-06-2025
They may talk the talk, but can they truly think it through? A new study by Apple researchers suggests that even the most advanced AI models like ChatGPT o3, Claude, and DeepSeek start to unravel when the going gets tough. These so-called 'reasoning' models may impress with confident answers and detailed explanations, but when faced with genuinely complex problems, they stumble – and sometimes fall flat.

Apple researchers have found that the most advanced large language models today may not be reasoning in the way many believe. In a recently released paper titled The Illusion of Thinking, researchers at Apple show that while these models appear intelligent on the surface, their performance dramatically collapses when they are faced with truly complex problems.

The study looked at a class of models now referred to as Large Reasoning Models (LRMs), which are designed to 'think' through complex tasks using a series of internal steps, often called a 'chain of thought.' This includes models like OpenAI's o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking. Apple's researchers tested how these models handle problems of increasing difficulty – not just whether they arrive at the correct answer, but how they reason their way there.

The findings were striking. As problem complexity rose, the models' performance did not degrade gracefully – it collapsed completely. 'They think more up to a point,' tweeted tech critic Josh Wolfe, referring to the findings. 'Then they give up early, even when they have plenty of compute left.'

Apple's team built custom puzzle environments such as the Tower of Hanoi, River Crossing, and Blocks World to carefully control complexity levels. These setups allowed them to observe not only whether the models found the right answer, but how they tried to get there.

They found that:
- At low complexity, traditional LLMs (without reasoning chains) performed better and were more efficient
- At medium complexity, reasoning models briefly took the lead
- At high complexity, both types failed completely

Even when given a step-by-step algorithm for solving a problem, so that they only needed to follow instructions, models still made critical mistakes. This suggests that they struggle not only with creativity or problem-solving, but with basic logical execution.

The models also showed odd behaviour when it came to how much effort they put in. Initially, they 'thought' more as the problems got harder, using more tokens for reasoning steps. But once a certain threshold was reached, they abruptly started thinking less. This happened even when they hadn't hit any computational limits, highlighting what Apple calls a 'fundamental inference time scaling limitation.'

Cognitive scientist Gary Marcus said the paper supports what he's been arguing for decades: these systems don't generalise well beyond their training data. 'Neural networks can generalise within a training distribution of data they are exposed to, but their generalisation tends to break down outside that distribution,' Marcus wrote on Substack. He also noted that the models' 'reasoning traces' – the steps they take to reach an answer – can look convincing, but often don't reflect what the models actually did to reach a conclusion.

Marcus also points out that Apple's findings echo the work of Arizona State University's Subbarao (Rao) Kambhampati, who has previously critiqued so-called reasoning models.
Rao has shown that models often appear to think logically but actually produce answers that don't match their thought process. Apple's experiments back this up by showing models generate long reasoning paths that still lead to the wrong answer, particularly as problems get harder.

Perhaps the most damning evidence came when Apple tested whether models could follow exact instructions. In one test, they were handed the algorithm to solve the Tower of Hanoi puzzle and asked to just execute it. The models still failed once the puzzle complexity passed a certain point.

Apple's conclusion is blunt: today's top models are 'super expensive pattern matchers' that can mimic reasoning only within familiar settings. The moment they're faced with novel problems – ones just outside their training data – they crumble.

These findings have serious implications for claims that AI is becoming capable of human-like reasoning. As the paper puts it, the current approach may be hitting a wall, and overcoming it could require an entirely different way of thinking about how we build intelligent systems. In short, we are still leaps away from AGI.
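
For readers unfamiliar with the puzzle, the sketch below shows what 'executing the algorithm' for the Tower of Hanoi can look like and why difficulty climbs so quickly with each added disk. It is the standard textbook recursive solution, not the prompt or code used in Apple's paper, which this article does not reproduce; the function names and peg labels are illustrative assumptions.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that solves n disks."""
    if n == 0:
        return []
    return (
        hanoi_moves(n - 1, source, spare, target)    # clear the top n-1 disks
        + [(source, target)]                         # move the largest disk
        + hanoi_moves(n - 1, spare, target, source)  # restack the n-1 disks
    )


def is_valid_solution(n, moves):
    """Replay the moves and reject any that put a larger disk on a smaller one."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False  # tried to move from an empty peg
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # larger disk placed on a smaller one
        pegs[dst].append(disk)
    return pegs["C"] == list(range(n, 0, -1))


# Hypothetical complexity levels, chosen only for illustration.
for n in (3, 7, 10, 15):
    moves = hanoi_moves(n)
    print(n, len(moves), is_valid_solution(n, moves))
```

The shortest solution for n disks takes 2^n - 1 moves, so 3 disks need 7 moves while 15 disks need 32,767. A single wrong move anywhere in that sequence invalidates the whole answer, which is one way to see why accuracy can fall off sharply as disks are added.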

Related Articles

Apple's iPhone 17 Pro Max Leaks Hint at Thicker, Battery-Centric Design Shift

Hans India

Apple may be rewriting the rules of premium smartphone design with its upcoming iPhone 17 Pro Max. According to recent leaks, the tech giant is moving away from its obsession with thinness — a signature feature for over a decade — and embracing a bolder, battery-first design. The iPhone 17 Pro Max is expected to feature a body thickness of 8.725mm, a notable increase from the 8.25mm thickness of the iPhone 16 Pro Max. While this may seem like a minor physical change, the implication is major: more room for a larger battery, and with it, significantly improved endurance. This marks the first Pro Max model since the iPhone X era to actively prioritize battery capacity over aesthetic minimalism. Apple's shift reflects a growing consumer preference for longer battery life over sleek profiles. The regular iPhone 17 Pro, however, will reportedly retain the same size and battery as its predecessor, making the Max variant stand out not just for its screen size or price — but now, for power and practicality. Currently, the iPhone 16 Pro Max offers up to 33 hours of video playback and 105 hours of audio. With the increased chassis size, the iPhone 17 Pro Max could push those limits even further, potentially becoming Apple's longest-lasting iPhone ever — and a serious contender across the premium smartphone market. Interestingly, this year's leaks do not indicate any major camera exclusives for the Pro Max — a departure from previous years where zoom capabilities and stabilization features set it apart. In 2025, Apple appears to be betting big on battery life as the headline feature. If these reports hold true, Apple is ushering in a subtle yet significant design shift — one that trades featherlight finesse for functional longevity. In a market crowded with ultra-thin, power-hungry devices, the iPhone 17 Pro Max may redefine what a flagship smartphone should be: powerful, practical, and built to last.

iPhone 17 Pro And 17 Pro Max Launching In 2025: These 5 Upgrades Could Make Them Worthy

News18

The iPhone 17 Pro series is widely tipped to get major upgrades in design, camera and possibly battery size.

The iPhone 17 Pro series is expected to launch in the next few months, when Apple holds its big 2025 event to announce the new iPhones. The latest Pro models will understandably carry some notable upgrades, with some design changes in the offing, at least going by the recent leaks. But when people pay a premium, they expect the iPhone Pro to blow their minds with its camera features, and reports say Apple could bring as many as four new options for the shutterbugs. Looking at the iPhone 17 Pro as a whole, there could be five big upgrades that are noteworthy and make the new iPhones exciting.

Some Design Changes

The iPhone 17 Pro design could be more durable than its predecessors, as Apple is tipped to use aluminium instead of titanium for the chassis. The company has been doing well in recent bend tests, and this change could play a big role in keeping the body strong. The back of the iPhone 17 Pro series could see a mix of aluminium and glass finishes, which could add style and substance to the models. The camera layout is also going to be part of the design change, with more room given to the triple sensors and some distance between the sensors and the LED flash at the back.

New Pro Hardware

Apple will be offering a new Pro-level chipset for the iPhone 17 Pro series. Last year we got the A18 Pro, and now it is time to see the A19 Pro SoC powering the company's new Pro lineup in 2025. TSMC will be using the 3nm process to manufacture the new hardware for the iPhones, and going by the performance levels of the A18 Pro, you can expect a major bump this year as well.

Time For Telephotos

Apple is rumoured to bring a new 48MP telephoto lens, which should be a notable upgrade on the 12MP lens that the iPhone 16 Pro models got last year. This means Apple could bring a full set of 48MP sensors to its triple camera system with this year's iPhone 17 Pro and 17 Pro Max models. That's not all: the iPhone 17 Pro models could also get support for variable aperture with the new lens, as mentioned in multiple recent leaks.

The New Selfie Game

It is not just the back of the iPhone 17 Pro where we could see big upgrades; the front is also overdue for a new setup. Reports have widely talked about a new 24MP shooter packed under the notch on the 17 Pro models. The Pro models have always looked to stand out from the regular versions, and at their top price they have needed those upgrades. It is high time that Apple shifts its focus to the front camera and really goes all out, which could happen this year. With these changes, the iPhone 17 Pros could offer dual-cam recording for the first time.

The Big Battery Push

Apple is likely to ditch its focus on making the iPhone 17 Pro models slim, which could have a big impact on the size of the battery packed inside these devices. We can't say anything for certain about the figures, and Apple does its best to keep them a secret, but we are hoping the mAh numbers will be bigger than what the iPhone 16 Pro and 16 Pro Max offered. Will that also be complemented with faster charging speeds? We have been hearing these rumours for years but Apple does not bend; hopefully this year it changes.

First Published: July 03, 2025, 07:30 IST

Nvidia sets new record, leaves Apple and Microsoft behind to become first company in history to achieve this milestone

Time of India

Nvidia touched a staggering market value of $3.92 trillion on Thursday (July 3), positioning itself to become the most valuable company in history. The surge is fuelled by Wall Street's optimism surrounding artificial intelligence (AI). The high-end AI chipmaker pushed its market capitalisation beyond Apple's previous record closing value of $3.915 trillion, set on December 26, 2024. According to news agency Reuters, the company's newest chips have been instrumental in training the largest AI models, driving an insatiable demand for its products. Currently, Microsoft holds the second spot with a market capitalisation of $3.7 trillion, and Apple now sits in third place, with a market value of $3.19 trillion.

How Nvidia – not Google and Microsoft – is winning the AI race

The competition among tech giants like Microsoft, Amazon, Meta Platforms, Alphabet (Google) and Tesla to build advanced AI data centers and lead in emerging AI technology has directly translated into increased demand for Nvidia's specialised chips. Nvidia, whose core technology was initially developed for video games, has seen its stock market value skyrocket nearly eight-fold over the past four years, growing from $500 billion in 2021 to nearly $4 trillion.

Nvidia's market cap is more than that of all publicly listed companies in Canada and Mexico combined

Citing LSEG data, the report says that Nvidia's current valuation surpasses the combined market cap of all publicly listed companies in both Canada and Mexico, and even exceeds the total value of all publicly traded companies in the United Kingdom. Nvidia's stock has also registered a powerful rebound, climbing over 68% from its recent low on April 4. That dip occurred when Wall Street reacted to US President Donald Trump's global tariff announcements. US stocks, including Nvidia, have since recovered on expectations that the White House will finalise trade deals to mitigate the impact of Trump's tariffs.
