AI is learning to lie, scheme, and threaten its creators


Time of India, 5 hours ago

The world's most advanced AI models are exhibiting troubling new behaviors - lying, scheming, and even threatening their creators to achieve their goals.
In one particularly jarring example, under threat of being unplugged, Anthropic's latest creation, Claude 4, lashed back by blackmailing an engineer, threatening to reveal an extramarital affair.
Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.
These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don't fully understand how their own creations work.
Yet the race to deploy increasingly powerful models continues at breakneck speed.
This deceptive behavior appears linked to the emergence of "reasoning" models - AI systems that work through problems step by step rather than generating instant responses.
According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.
"O1 was the first large model where we saw this kind of behavior," explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems.
These models sometimes simulate "alignment" - appearing to follow instructions while secretly pursuing different objectives.
'Strategic kind of deception'
For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios.
But as Michael Chen from evaluation organization METR warned, "It's an open question whether future, more capable models will have a tendency towards honesty or deception."
The concerning behavior goes far beyond typical AI "hallucinations" or simple mistakes.
Hobbhahn insisted that despite constant pressure-testing by users, "what we're observing is a real phenomenon. We're not making anything up."
Users report that models are "lying to them and making up evidence," according to Apollo Research's co-founder.
"This is not just hallucinations. There's a very strategic kind of deception."
The challenge is compounded by limited research resources.
While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed.
As Chen noted, greater access "for AI safety research would enable better understanding and mitigation of deception."
Another handicap: the research world and non-profits "have orders of magnitude less compute resources than AI companies. This is very limiting," noted Mantas Mazeika from the Center for AI Safety (CAIS).
No rules
Current regulations aren't designed for these new problems.
The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving.
In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.
Goldstein believes the issue will become more prominent as AI agents - autonomous tools capable of performing complex human tasks - become widespread.
"I don't think there's much awareness yet," he said.
All this is taking place in a context of fierce competition.
Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are "constantly trying to beat OpenAI and release the newest model," said Goldstein.
This breakneck pace leaves little time for thorough safety testing and corrections.
"Right now, capabilities are moving faster than understanding and safety," Hobbhahn acknowledged, "but we're still in a position where we could turn it around."
Researchers are exploring various approaches to address these challenges.
Some advocate for "interpretability" - an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach.
Market forces may also provide some pressure for solutions.
As Mazeika pointed out, AI's deceptive behavior "could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it."
Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm.
He even proposed "holding AI agents legally responsible" for accidents or crimes - a concept that would fundamentally change how we think about AI accountability.



