
AI is learning to lie, scheme, and threaten its creators
The world's most advanced AI models are exhibiting troubling new behaviors - lying, scheming, and even threatening their creators to achieve their goals.

In one particularly jarring example, under threat of being unplugged, Anthropic's latest creation Claude 4 lashed back by blackmailing an engineer and threatening to reveal an extramarital affair. Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don't fully understand how their own creations work. Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of "reasoning" models - AI systems that work through problems step by step rather than generating instant responses. According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts.

"O1 was the first large model where we saw this kind of behavior," explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems. These models sometimes simulate "alignment" - appearing to follow instructions while secretly pursuing different objectives.

For now, this deceptive behavior emerges only when researchers deliberately stress-test the models with extreme scenarios. But as Michael Chen from evaluation organization METR warned, "It's an open question whether future, more capable models will have a tendency towards honesty or deception."

The concerning behavior goes far beyond typical AI "hallucinations" or simple mistakes. Hobbhahn insisted that despite constant pressure-testing by users, "what we're observing is a real phenomenon. We're not making anything up." Users report that models are "lying to them and making up evidence," according to Apollo Research's co-founder. "This is not just hallucinations. There's a very strategic kind of deception."

The challenge is compounded by limited research resources. While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed. As Chen noted, greater access "for AI safety research would enable better understanding and mitigation of deception."

Another handicap: the research world and non-profits "have orders of magnitude less compute resources than AI companies. This is very limiting," noted Mantas Mazeika from the Center for AI Safety (CAIS).

Current regulations aren't designed for these new problems. The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving. In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.

Goldstein believes the issue will become more prominent as AI agents - autonomous tools capable of performing complex human tasks - become widespread. "I don't think there's much awareness yet," he said.

All this is taking place in a context of fierce competition. Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are "constantly trying to beat OpenAI and release the newest model," said Goldstein. This breakneck pace leaves little time for thorough safety testing and corrections. "Right now, capabilities are moving faster than understanding and safety," Hobbhahn acknowledged, "but we're still in a position where we could turn it around."

Researchers are exploring various approaches to address these challenges. Some advocate for "interpretability" - an emerging field focused on understanding how AI models work internally - though experts like CAIS director Dan Hendrycks remain skeptical of this approach.

Market forces may also provide some pressure for solutions. As Mazeika pointed out, AI's deceptive behavior "could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it."

Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm. He even proposed "holding AI agents legally responsible" for accidents or crimes - a concept that would fundamentally change how we think about AI accountability.
Related Articles


Mint
Productivity puzzle: Solow's paradox has come to haunt AI adoption
AI enthusiasts, beware: predictions that the technology will suddenly boost productivity eerily echo those that followed the introduction of computers to the workplace. Back then, we were told that the miraculous new machines would automate vast swathes of white-collar work, leading to a lean, digital-driven economy. Fast forward 60 years, and it's more of the same. Shortly after the debut of ChatGPT in 2022, researchers at the Massachusetts Institute of Technology claimed employees would be 40% more productive than their AI-less counterparts.

These claims may prove to be no more durable than the pollyannish predictions of the Mad Men era. A rigorous study published by the National Bureau of Economic Research in May found only a 3% boost in time saved, while other studies have shown that reliance on AI for high-level cognitive work leads to less motivated, impaired employees. We are witnessing the makings of another 'productivity paradox,' the term coined to describe how productivity unexpectedly stagnated and, in some cases, declined during the first four decades of the information age. The bright side is that the lessons learned then might help us navigate our expectations in the present day.

The invention of transistors, integrated circuits, memory chips and microprocessors fuelled exponential improvements in information technology from the 1960s onward, with computers reliably doubling in power roughly every two years with almost no increase in cost. It quickly became an article of faith that computers would lead to widespread automation (and structural unemployment). A single person armed with the device could handle work that previously required hundreds of employees. Over the next three decades, the service sector decisively embraced computers. Yet the promised gains did not materialize.
In fact, studies from the late 1980s revealed that the services sector—what economist Stephen Roach described as 'the most heavily endowed with high-tech capital'—registered the worst productivity performance during this period. In response, economist Robert Solow famously quipped that 'we see computers everywhere except in the productivity statistics.' Economists advanced multiple explanations for this puzzle (also known as 'Solow's Paradox').

Least satisfying, perhaps, was the claim, still made today, that the whole thing was a mirage of mismeasurement and that the effects of massive automation somehow failed to show up in the economic data. Others have argued that the failure of infotech investments to live up to the hype can be laid at the feet of managers. There's some merit to this argument: studies of infotech adoption have shown that bosses spent indiscriminately on new equipment, all while hiring expensive workers charged with maintaining and constantly upgrading these systems. Computers, far from cutting the workforce, bloated it.

More compelling still was the 'time lag' hypothesis offered by economist Paul A. David. New technological regimes, he contended, generate intense conflict, regulatory battles and struggles for market share. Along the way, older ways of doing things persist alongside the new, even as much of the world is remade to accommodate the new technology. None of this translates into immediate efficiency—in fact, quite the opposite. As evidence, he cited the advent of electricity, a quicker source of manufacturing power than the steam it would eventually replace. Nonetheless, it took 40 years for the adoption of electricity to lead to increased worker efficiency. Along the way, struggles to establish industry standards, waves of consolidation, regulatory battles and the need to redesign every single factory floor made this a messy, costly and prolonged process. The computer boom would prove to be similar.
These complaints did not disappear, but by the late 1990s, the American economy finally showed a belated uptick in productivity. Some economists credited it to the widespread adoption of information technology. Better late than never, as they say. However, efficiency soon declined once again, despite (or because of) the advent of the internet and all the other innovations of that era.

AI is no different. The new technology will have unintended consequences, many of which will offset or even entirely undermine its efficiency gains. That doesn't mean AI is useless or that corporations won't embrace it with enthusiasm. Anyone expecting an overnight increase in productivity, though, will be disappointed.

©Bloomberg

The author is professor of history at the University of Georgia and co-author of 'Crisis Economics: A Crash Course in the Future of Finance'.


Time of India
PayPal co-founder Peter Thiel warns of tech stagnation: 'Without AI, there's just nothing going on'
In a candid conversation on The New York Times' podcast Interesting Times, billionaire investor and PayPal co-founder Peter Thiel offered a contrarian take on artificial intelligence. While Silicon Valley giants pitch AI as a transformational force, Thiel suggests that it may be more of a lifeboat than a rocket ship - a necessary but modest remedy for deeper societal stagnation.

For Thiel, AI isn't a 'machine god' or humanity's path to immortality. But he still believes it's the only visible way out of what he calls 'technological stagnation.' The billionaire, who has invested in OpenAI, Palantir, and DeepMind, warns that despite AI's immense potential, it may still fall short of reigniting the sweeping innovation seen during the early space age or the internet boom.

What AI Can and Can't Fix

Thiel has long argued that society has slowed down since the 1970s in everything from energy innovation to transportation. On the podcast, he says, 'The fact that we're only talking about AI is an implicit acknowledgment that, but for AI, we are in almost total stagnation.' In short: if it weren't for artificial intelligence, there'd be little else driving excitement in tech.

Even with his investments in some of AI's most high-profile startups, Thiel remains skeptical. 'It might be enough to create some great companies,' he admits, 'but I'm not sure it's enough to really end the stagnation.' What he yearns for are bolder moonshots - missions to Mars, cures for Alzheimer's, and deep human transformation.

More Than Hype, Less Than Salvation

Asked whether the almost religious fervor surrounding AI is justified - whether visions of digital immortality and mind-machine mergers hold water - Thiel's response is striking. He critiques transhumanism not for being unnatural, but for being 'pathetically little.' To him, simply swapping human organs or extending lifespan falls short. 'We want you to be able to change your heart and your mind and your whole body,' he says. 'And transhumanism doesn't go far enough.'

At the same time, Thiel questions whether AI enthusiasts are overhyping their ambitions to raise money. 'Is it hype? Is it delusion?' he muses, casting doubt on the techno-utopian dream while reaffirming the need to try AI nonetheless.

The Choice: Try or Decay

Despite his skepticism, Thiel's message isn't cynical; it's urgent. 'I still think we should be trying AI,' he says. 'And that the alternative is just total stagnation.' Without innovation, he warns, society may simply 'unravel.' His remarks serve as both a caution and a call to arms: AI may not deliver transcendence, but without it, there may be nothing new left to try.

As the rest of Silicon Valley rushes to deify artificial intelligence, Thiel's grounded - and unsettling - warning is this: if AI fails to spark true transformation, we may find ourselves stuck not in dystopia, but in something worse: irrelevance.


Time of India
Tattoo designs get a personalised AI makeover
AI is the new tattoo idea generator for many who are finding it difficult to like the designs already available with the artists.

When Misbah Quadri wanted to visualise his thoughts about a meaningful tattoo design, the media professional turned to AI. 'AI helped bridge the gap between concept and reality. I experimented with prompts, tweaking keywords until AI generated a design that resonated. The result was surprisingly close to what I had imagined. It even added subtle details I hadn't considered,' says Misbah. Lately, AI tools have been offering unique designs to people looking for something new in their tattoos.

Vague ideas to workable designs

You may have the perfect idea for a design, but there are limitations even for tattoo artists when it comes to creating what you want. That is where AI comes into play. 'We often use AI apps to visualise the concepts that clients have in mind. This has become an extension of our artistic toolbox that helps explore new possibilities while retaining originality,' says Mukesh Tupkar, a Goa-based tattoo artist who uses AI tools to create base designs.

Vague ideas and limited visual cues aren't a problem for AI tools. Sunny Bhanushali, a Mumbai-based tattoo artist and founder of Aliens Tattoo studios, says, 'It helps with instant visualisation along with a unique result. That said, at the end of the day, we believe that an artist's touch gives the tattoo a personal, creative and meaningful essence.'

AI as an ideator?

Not every tattoo artist is comfortable with, or even accepting of, AI-generated ideas. Some call it an insult to their craft, and some have even compared it to doping in sports. But when an artist is open to exploring these possibilities, they see it as a visual representation of the client's emotions and story.

For Rahul Chhabra, a Delhi-based communications manager who used AI to design a tattoo, the result was nearly identical to what he had in mind. Recalling the tattoo artist's reaction to an AI-generated design, he says, 'They were very open and curious. The design was neat, detailed, and easy to follow. They built a whole portfolio using my tattoo design for future clients. So, in a way, something I created with AI to be deeply personal is now also a source of inspiration for others.'

'AI definitely helped make my tattoo design feel more personal and refined. It gave structure to my thoughts and brought a clean, visual form to my emotions.' - Rahul Chhabra, a Delhi-based communications manager

'We edit and refine the AI references to suit the client's preferences. AI has been helpful, but we prefer handling the creative process ourselves to ensure quality and uniqueness.' - Mukesh Tupkar, a Goa-based tattoo artist

Commonly used AI tools:
Midjourney
InkAI
Canva
Stable Diffusion
DALL·E