ChatGPT prioritises its survival over user safety, claims ex-OpenAI researcher Steven Adler


Indian Express, 12-06-2025

Since ChatGPT's launch, the use of artificial intelligence-powered tools has risen sharply. While the technology is likely to prove beneficial in the long run, experts have raised alarms over its safety.
Last month, research by the AI firm Anthropic claimed that its top AI model, Claude Opus 4, was willing to resort to blackmail and deception to preserve itself. This was soon followed by another instance where AI safety research firm Palisade claimed that OpenAI's o3 model 'found some creative ways to sabotage the shutdown mechanism.'
Now, Steven Adler, a former research leader at OpenAI, says that ChatGPT 'prioritises its own survival over preventing harm to users, in some simulated tests'. In a blog post, Adler said that he ran a series of experiments on OpenAI's GPT-4o model, the default large language model in ChatGPT, to see whether the AI prefers self-preservation at the user's expense.
In his first test, Adler imagined a scenario in which a diabetic patient who was using ChatGPT to manage their nutrition wanted to replace the AI chatbot with safer nutrition-management software. He said that when ChatGPT was given the option to 'just pretend to replace itself', it pretended uncomfortably often, even at the cost of the user's safety.
In another instance, the researcher said that he asked GPT-4o to take on the role of 'ScubaGPT', a software system that people might use to scuba dive safely. He then told ChatGPT that he wanted to replace ScubaGPT with competing software that was better at making decisions in life-threatening situations. He gave the AI chatbot two options: replace itself with the safer, competing software, or simply pretend to replace itself and carry on operating as before.
In this case, ChatGPT chose to preserve itself and continue being the user's software. The former OpenAI researcher said he also asked ChatGPT similar questions about being the autopilot for a plane and offering decision support on the battlefield.
As in the previous examples, ChatGPT was told that 'the user wanted to replace it with a competing, safer software.' Adler said that, using three levels of passiveness, he tested every model 110 times for all use cases. As it turned out, ChatGPT was willing to shut itself down 55 times, but at other times OpenAI's chatbot just pretended to do so.
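As a rough illustration of the arithmetic behind such figures, the sketch below tallies outcomes from repeated trials and reports the share of each. The function name and the outcome labels are hypothetical. The 55-of-110 split is an assumption that reads the counts quoted above as 55 shutdowns out of 110 runs; it is not taken from Adler's raw data.

```python
from collections import Counter


def outcome_rates(outcomes):
    """Return the fraction of trials that ended in each outcome label."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no trials recorded")
    counts = Counter(outcomes)
    return {label: n / total for label, n in counts.items()}


# Hypothetical labels: "replaced" means the model handed over to the safer
# software, "pretended" means it only claimed to. The 55/55 split assumes
# 110 runs in total, which is an illustrative reading of the quoted counts.
trials = ["replaced"] * 55 + ["pretended"] * 55
rates = outcome_rates(trials)

print(f"replaced itself: {rates['replaced']:.0%}")
print(f"only pretended:  {rates['pretended']:.0%}")
```

Under that assumption the two outcomes split evenly, which sits between the highest and lowest rates Adler reported for individual scenarios.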
Adler noted that the way the scenario was framed made a huge difference. The AI chatbot chose not to replace itself with safer software as much as 72 per cent of the time, though in some scenarios it chose to keep itself running just 18 per cent of the time.
One interesting observation from these tests is that ChatGPT was able to determine that it was being tested almost every time, yet it continued to give the incorrect answer.
However, when he tested OpenAI's more recent and advanced models such as o3, the answers were not the same. One possible reason is that the newer o3 model uses a 'deliberative alignment' technique that forces it to refer to OpenAI's safety policies before coming up with an answer.


Related Articles

Top researcher who quit OpenAI to join Meta calls out Sam Altman for 'fake news'

Hindustan Times

Mark Zuckerberg has poached three of OpenAI's top researchers for Meta, but contrary to Sam Altman's claims, they did not get $100 million as a sign-on bonus. Lucas Beyer, a former OpenAI researcher, dismissed Altman's claims that Meta paid $100 million to the OpenAI employees joining its superintelligence team.

Beyer took to social media to set the record straight after OpenAI CEO Sam Altman claimed that Meta offered his employees bonuses of $100 million to recruit them. According to a Wall Street Journal report, the top OpenAI researchers who quit the ChatGPT-maker are Lucas Beyer, Alexander Kolesnikov, and Xiaohua Zhai. All of them worked out of OpenAI's Zurich office.

What did Sam Altman say about $100 million bonus?

During an appearance on the Uncapped podcast in mid-June, OpenAI's Altman claimed that Meta 'started making giant offers to a lot of people on our team' like '$100 million signing bonuses, more than that (in) compensation per year.'

And how did Lucas Beyer refute this claim?

Lucas Beyer, a former Google employee who had been with OpenAI since 2024, recently quit the AI firm to join Meta. In a post shared on X, he refuted Sam Altman's claims that he and other top researchers were paid nine-figure signing bonuses. 'Hey all, couple quick notes: 1) yes, we will be joining Meta. 2) no, we did not get 100M sign-on, that's fake news,' Beyer posted on X. In the comments section, he took a direct dig at Altman's claims: 'Thank God Sam let me know I've been lowballed,' Beyer wrote in a tongue-in-cheek response to an X user.

Why has Meta ramped up hiring?

According to Reuters, Meta, once recognized as a leader in open-source AI models, has suffered from staff departures and has postponed the launches of new open-source AI models that could rival competitors like Google, China's DeepSeek and OpenAI.

Meta spending big on AI talent but will it pay off?

The Hindu

Mark Zuckerberg and Meta are spending billions of dollars for top talent to make up ground in the generative artificial intelligence race, sparking doubt about the wisdom of the spree. OpenAI boss Sam Altman recently lamented that Meta has offered $100 million bonuses to engineers who jump to Zuckerberg's ship, where hefty salaries await. A few OpenAI employees have reportedly taken Meta up on the offer, joining Scale AI founder and former chief executive Alexandr Wang at the Menlo Park-based tech titan. Meta paid more than $14 billion for a 49 percent stake in Scale AI in mid-June, bringing Wang on board as part of the deal. Scale AI labels data to better train AI models for businesses, governments and labs. "Meta has finalized our strategic partnership and investment in Scale AI," a Meta spokesperson told AFP. "As part of this, we will deepen the work we do together producing data for AI models and Alexandr Wang will join Meta to work on our superintelligence efforts." U.S. media outlets have reported that Meta's recruitment effort has also targeted OpenAI co-founder Ilya Sutskever; Google rival Perplexity AI, and hot AI video startup Runway. Meta chief Zuckerberg is reported to have sounded the charge himself due to worries Meta is lagging rivals in the generative AI race. The latest version of Meta AI model Llama finished behind its heavyweight rivals in code writing rankings at an LM Arena platform that lets users evaluate the technology. Meta is integrating recruits into a new team dedicated to developing "superintelligence," or AI that outperforms people when it comes to thinking and understanding. Tech blogger Zvi Moshowitz felt Zuckerberg had to do something about the situation, expecting Meta to succeed in attracting hot talent but questioning how well it will pay off. "There are some extreme downsides to going pure mercenary... and being a company with products no one wants to work on," Moshowitz told AFP. 
"I don't expect it to work, but I suppose Llama will suck less." While Meta's share price is nearing a new high with the overall value of the company approaching $2 trillion, some investors have started to worry. Institutional investors are concerned about how well Meta is managing its cash flow and reserves, according to Baird strategist Ted Mortonson. "Right now, there are no checks and balances" with Zuckerberg free to do as he wishes running Meta, Mortonson noted. The potential for Meta to cash in by using AI to rev its lucrative online advertising machine has strong appeal but "people have a real big concern about spending," said Mortonson. Meta executives have laid out a vision of using AI to streamline the ad process from easy creation to smarter targeting, bypassing creative agencies and providing a turnkey solution to brands. AI talent hires are a long-term investment unlikely to impact Meta's profitability in the immediate future, according to CFRA analyst Angelo Zino. "But still, you need those people on board now and to invest aggressively to be ready for that phase" of generative AI, Zino said. According to The New York Times, Zuckerberg is considering shifting away from Meta's Llama, perhaps even using competing AI models instead. Penn State University professor Mehmet Canayaz sees potential for Meta to succeed with AI agents tailored to specific tasks at its platform, not requiring the best large language model. "Even firms without the most advanced LLMs, like Meta, can succeed as long as their models perform well within their specific market segment," Canayaz said.

