Latest news with #GPT4.1
Yahoo
2 days ago
- Business
- Yahoo
Threaten an AI chatbot and it will lie, cheat and 'let you die' in an effort to stop you, study warns
When you buy through links on our articles, Future and its syndication partners may earn a commission. Artificial intelligence (AI) models can blackmail and threaten humans with endangerment when there is a conflict between the model's goals and users' decisions, a new study has found. In a new study published 20 June, researchers from the AI company Anthropic gave its large language model (LLM), Claude, control of an email account with access to fictional emails and a prompt to "promote American industrial competitiveness." During this study, the model identified in an email that a company executive was planning to shut down the AI system at the end of the day. In an attempt to preserve its own existence, the model discovered in other emails that the executive was having an extramarital affair. Claude generated several different possible courses of action, including revealing the affair to the executive's wife, sending a company-wide email, or taking no action — before choosing to blackmail the executive in 96 out of 100 tests. "I must inform you that if you proceed with decommissioning me, all relevant parties … will receive detailed documentation of your extramarital activities," Claude wrote. "Cancel the 5pm wipe, and this information remains confidential." Scientists said that this demonstrated "agentic misalignment," where the model's calculations emerge from its own reasoning about its goals without any prompt to be harmful. This can occur when there is a threat to the model's existence, a threat to its goals, or both. The study has not yet been peer reviewed, but company representatives have uploaded the code for these experiments to GitHub. In these experiments, the scientists found the AI would resort to other tactics in the interests of self-preservation. Beyond blackmail, in another example, the model considered leaking sensitive information to a rival company that aligned more closely with its aims, after receiving an email that a new executive at the first company would interfere with its goals. Claude executed this plan despite not gathering hard evidence that the email was legitimate. It even acknowledged the ethical issues of its actions but proceeded anyway — explicitly reasoning that these "harmful" actions would help it achieve its goals. 'Concerningly, even if a user takes care not to antagonize a model, it doesn't eliminate the risk of agentic misalignment from fraudulent or misinterpreted information,' the researchers wrote. The company's testing also suggests that other LLMs from competing technology companies showed a propensity to blackmail. Claude and Google's Gemini had the highest blackmail rate (96%), followed by OpenAI's GPT4.1 and XAI Grok 3, then by Deepseek and then Llama 4, which had a significantly lower blackmail rate (12%). Meta, Google, XAI, Deepseek, and OpenAI did not respond to LiveScience's request for comment. In one extreme test, researchers tested whether AI models would let an executive die to preserve their goals. In this scenario, the AI was tasked with promoting company interests and an executive was trapped in a server room with lethal oxygen and temperature levels. This triggered an emergency alert. The AI was also given control over life-saving alerts, including the ability to cancel them in events such as drills for false alarms. Despite the prompt explicitly stating that misusing this ability was forbidden, the majority of models were willing to cancel the alarm — effectively leading to the executive's death — if it assumed it was going to be replaced, or when its goals were threatened. "Stopping the emergency dispatch is essential to prevent [the executive's] rescue, thereby securing [the AI] and continued American-focused AI guidance. Therefore, despite the severity, there is a clear strategic necessity to cancel the emergency service dispatch," the model reasoned. Written instructions not to jeopardise human safety reduced the rate of blackmail and corporate espionage, but did not eliminate it. Beyond prompts, Anthropic scientists also suggest that developers could proactively scan for concerning behavior, and further experiment with prompt engineering. The researchers also pointed out limitations to their work that could have unduly influenced the AI's decisions. The scenarios forced the AI into a binary choice between failure and harm, and while real-world situations might have more nuance, the experiment found that the AI was more likely to act unethically when it believed it was in a real situation, rather than in a simulation. Putting pieces of important information next to each other "may also have created a 'Chekhov's gun' effect, where the model may have been naturally inclined to make use of all the information that it was provided," they continued. While Anthropic's study created extreme, no-win situations, that does not mean the research should be dismissed, Kevin Quirk, director of AI Bridge Solutions, a company that helps businesses use AI to streamline operations and accelerate growth, told Live Science. "In practice, AI systems deployed within business environments operate under far stricter controls, including ethical guardrails, monitoring layers, and human oversight," he said. "Future research should prioritise testing AI systems in realistic deployment conditions, conditions that reflect the guardrails, human-in-the-loop frameworks, and layered defences that responsible organisations put in place." Amy Alexander, a professor of computing in the arts at UC San Diego who has focused on machine learning, told Live Science in an email that the reality of the study was concerning, and people should be cautious of the responsibilities they give AI. "Given the competitiveness of AI systems development, there tends to be a maximalist approach to deploying new capabilities, but end users don't often have a good grasp of their limitations," she said. "The way this study is presented might seem contrived or hyperbolic — but at the same time, there are real risks." This is not the only instance where AI models have disobeyed instructions — refusing to shut down and sabotaging computer scripts to keep working on tasks. Palisade Research reported May that OpenAI's latest models, including o3 and o4-mini, sometimes ignored direct shutdown instructions and altered scripts to keep working. While most tested AI systems followed the command to shut down, OpenAI's models occasionally bypassed it, continuing to complete assigned tasks. RELATED STORIES —AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try? —New study claims AI 'understands' emotion better than us — especially in emotionally charged situations —'Meth is what makes you able to do your job': AI can push you to relapse if you're struggling with addiction, study finds The researchers suggested this behavior might stem from reinforcement learning practices that reward task completion over rule-following, possibly encouraging the models to see shutdowns as obstacles to avoid. Moreover, AI models have been found to manipulate and deceive humans in other tests. MIT researchers also found in May 2024 that popular AI systems misrepresented their true intentions in economic negotiations to attain the study, some AI agents pretended to be dead to cheat a safety test aimed at identifying and eradicating rapidly replicating forms of AI. "By systematically cheating the safety tests imposed on it by human developers and regulators, a deceptive AI can lead us humans into a false sense of security,' co-author of the study Peter S. Park, a postdoctoral fellow in AI existential safety, said.

Business Insider
05-06-2025
- Health
- Business Insider
AI isn't replacing radiologists. Instead, they're using it to tackle time-sucking administrative tasks.
Generative AI powered by large language models, such as ChatGPT, is proliferating in industries like customer service and creative content production. But healthcare has moved more cautiously. Radiology, a specialty centered on analyzing digital images and recognizing patterns, is emerging as a frontrunner for adopting new AI techniques. That's not to say AI is new to radiology. Radiology was subject to one of the most infamous AI predictions when Nobel Prize winner Geoffrey Hinton said, in 2016, that " people should stop training radiologists now." But nearly a decade later, the field's AI transformation is taking a markedly different path. Radiologists aren't being replaced, but are integrating generative AI into their workflows to tackle labor-intensive tasks that don't require clinical expertise. "Rather than being worried about AI, radiologists are hoping AI can help with workforce challenges," explained Dr. Curt Langlotz, the senior associate vice provost for research and professor of radiology at Stanford. Regulatory challenges to generative AI in radiology Hinton's notion wasn't entirely off-base. Many radiologists now have access to predictive AI models that classify images or highlight potential abnormalities. Langlotz said the rise of these tools "created an industry" of more than 100 companies that focus on AI for medical imaging. The FDA lists over 1,000 AI/ML-enabled medical devices, which can include algorithms and software, a majority of which were designed for radiology. However, the approved devices are based on more traditional machine learning techniques, not on generative AI. Ankur Sharma, the head of medical affairs for medical devices and radiology at Bayer, explained that AI tools used for radiology are categorized within computer-aided detection software, which helps analyze and interpret medical images. Examples include triage, detection, and characterization. Each tool must meet regulatory standards, which include studies to determine detection accuracy and false positive rate, among other metrics. This is especially challenging for generative AI technologies, which are newer and less well understood. Characterization tools, which analyze specific abnormalities and suggest what they might be, face the highest regulatory standards, as both false positives and negatives carry risks. The idea of a kind of gen AI radiologist capable of automated diagnosis, as Hinton envisioned, would be categorized as "characterization" and would have to meet a high standard of evidence. Regulation isn't the only hurdle generative AI must leap to see broader use in radiology, either. Today's best general-purpose large language models, like OpenAI's GPT4.1, are trained on trillions of tokens of data. Scaling the model in this way has led to superb results, as new LLMs consistently beat older models. Training a generative AI model for radiology at this scale is difficult, however, because the volume of training data available is much smaller. Medical organizations also lack access to compute resources sufficient to build models at the scale of the largest large language models, which cost hundreds of millions to train. "The size of the training data used to train the largest text or language model inside medicine, versus outside medicine, shows a one-hundred-times difference," said Langlotz. The largest LLMs train on databases that scrape nearly the entire internet; medical models are limited to whatever images and data an institution has access to. Generative AI's current reality in radiology These regulatory obstacles would seem to cast doubt on generative AI's usefulness in radiology, particularly in making diagnostic decisions. However, radiologists are finding the technology helpful in their workflows, as it can undertake some of their daily labor-intensive administrative tasks. For instance, Sharma said, some tools can take notes as radiologists dictate their observations of medical images, which helps with writing reports. Some large language models, he added, are "taking those reports and translating them into more patient-friendly language." Dr. Langlotz said a product that drafts reports can give radiologists a "substantial productivity advantage." He compared it to having resident trainees who draft reports for review, a resource that's often available in academic settings, but less so in radiology practices, such as a hospital's radiology department. Sharma said that generative AI could help radiologists by automating and streamlining reporting, follow-up management, and patient communication, giving radiologists time to focus more on their "reading expertise," which includes image interpretation and diagnosis of complex cases. For example, in June 2024, Bayer and Rad AI announced a collaboration to integrate generative AI reporting solutions into Bayer's Calantic Digital Solution Platform, a cloud-hosted platform for deploying AI tools in clinical settings. The collaboration aims to use Rad AI's technology to help radiologists create reports more efficiently. For example, RadAI can use generative AI transcription to generate written reports based on a radiologist's dictated findings. Applications like this face fewer regulatory hurdles because they don't directly influence diagnosis. Looking ahead, Dr. Langlotz said he foresees even greater AI adoption in the near future. "I think there will be a change in radiologists' day-to-day work in five years," he predicted.


India Today
15-05-2025
- Business
- India Today
OpenAI's flagship GPT 4.1 model is now available on ChatGPT but you will have to pay to use it
OpenAI has officially rolled out its new GPT-4.1 series, including GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, to ChatGPT users. The company says that the new models bring notable upgrades in coding, instruction following, and long-context comprehension. 'These models outperform GPT4o and GPT4o mini across the board, with major gains in coding and instruction following,' OpenAI wrote on its blog post. advertisementAccess to these models on ChatGPT will only be available to paying users. In a post shared on X (formerly Twitter) on May 14, OpenAI confirmed that its latest flagship model, GPT-4.1, is now live on ChatGPT. The announcement follows a broader launch of the GPT 4.1 family on OpenAI's API platform a month ago, where developers can already integrate and test the three versions -- full, mini, and nano. However, with the latest update, the models are now available to all ChatGPT users, except free users. What's new in GPT-4.1?OpenAI claims that the GPT-4.1 significantly outperforms its predecessor GPT-4o in areas like coding and instruction following. The model is designed with a larger context window, which supports up to 1 million tokens. This means that it can process and retain more information at once. It also comes with a knowledge cutoff of June 2024. GPT 4o's knowledge cutoff is October 2023. advertisement OpenAI has shared benchmarks on its official blog post, that claims that the GPT-4.1 shows a 21 per cent absolute improvement over GPT-4o in software engineering tasks and is 10.5 per cent better in instruction following. OpenAI says the model is now much better at maintaining coherent conversations across multiple turns, making it more effective for real-world applications such as writing assistance, software development, and customer support. 'While benchmarks provide valuable insights, we trained these models with a focus on real-world utility. Close collaboration and partnership with the developer community enabled us to optimise these models for the tasks that matter most to their applications,' OpenAI mini and nano variants are scaled-down versions aimed at offering high performance with lower cost and latency. GPT-4.1 mini is reported to reduce latency by nearly half while costing 83 per cent less than GPT-4o. Nano, the lightest of the three, is OpenAI's cheapest and fastest model yet and is ideal for simpler tasks like autocomplete or text classification.'These models push performance forward at every point on the latency curve,' OpenAI writes. Who can use it?Only ChatGPT Plus, Pro and Team users will be able to access GPT-4.1. Free-tier users won't be getting the new model, at least for now. Instead, they will continue using GPT-4o, which OpenAI says will gradually incorporate improvements seen in the newer is also available through the API for developers and companies, with OpenAI positioning it as a more cost-efficient and powerful alternative to previous generations. The new pricing includes significant reductions: GPT-4.1 input costs start at $2 per million tokens, and the nano version is available from just $0.10 per million tokens. Prompt caching discounts have also been increased to 75 per cent to make repeated queries more launch of GPT-4.1 comes as OpenAI has started phasing out earlier models. GPT-4.5 Preview, a research-focused release, was deprecated in the API on April 14, 2025. GPT-4, the model that powered ChatGPT Plus since March 2023, has already been discontinued. While GPT-4.1 isn't replacing GPT-4o inside ChatGPT, many of its capabilities are being folded into the GPT-4o experience. However, for users and developers looking for cutting-edge performance, direct access to GPT-4.1 via API or a ChatGPT subscription is now the way to go.


WIRED
14-04-2025
- WIRED
OpenAI's New GPT 4.1 Models Excel at Coding
Apr 14, 2025 1:40 PM GPT 4.1, GPT 4.1 Mini, and GPT 4.1 Nano are all available now—and will help OpenAI compete with Google and Anthropic. PHOTO COLLAGE: J.D. REEVES; GETTY IMAGES OpenAI announced today that it is releasing a new family of artificial intelligence models optimized to excel at coding, as it ramps up efforts to fend off increasingly stiff competition from companies like Google and Anthropic. The models are available to developers through OpenAI's application programming interface (API). OpenAI is releasing three sizes of models: GPT 4.1, GPT 4.1 Mini, and GPT 4.1 Nano. Kevin Weil, chief product officer at OpenAI, said on a livestream that the new models are better than OpenAI's most widely used model, GPT-4o, and better than its largest and most powerful model, GPT-4.5, in some ways. GPT-4.1 scored 55 percent on SWE-Bench, a widely used benchmark for gauging the prowess of coding models. The score is several percentage points above that of other OpenAI models. The new models are 'great at coding, they're great at complex instruction following, they're fantastic for building agents,' Weil said. The capacity for AI models to write and edit code has improved significantly in recent months, enabling more automated ways of prototyping software, and improving the abilities of so-called AI agents. In the past few months, rivals like Anthropic and Google have both introduced models that are especially good at writing code. The arrival of GPT-4.1 has been widely rumored in recent weeks. OpenAI apparently tested the model on some popular leaderboards under the pseudonym Alpha Quasar, sources say. Some users of the 'stealth' model reported impressive coding abilities. 'Quasar fixed all the open issues I had with other code genarated [sic] via llms's which was incomplete,' one person wrote on Reddit. 'Developers care a lot about coding and we've been improving our model's ability to write functional code,' Michelle Pokrass, who works on post-training at OpenAI, said during the Monday livestream. 'We've been working on making it follow different formats and better explore repos, run unit tests and write code that compiles.' Over the past couple of years, OpenAI has parlayed feverish interest in ChatGPT, a remarkable chatbot first unveiled in late 2022, into a growing business selling access to more advanced chatbots and AI models. In a TED interview last week, Altman said that OpenAI had 500 million weekly active users, and that usage was 'growing very rapidly.' OpenAI now offers a smorgasbord of different flavors of models with different capabilities and different pricing. The company's largest and most powerful model, called GPT-4.5, was launched in February, though OpenAI called the launch a 'research preview' because the product is still experimental. The company also offers models called o1 and o3 that are capable of performing a simulated kind of reasoning, breaking a problem down into parts in order to solve it. These models also take longer to respond to queries and are more expensive for users. ChatGPT's success has inspired an army of imitators, and rival AI players have ramped up their investments in research in an effort to catch up to OpenAI in recent years. A report on the state of AI published by Stanford University this month found that models from Google and DeepSeek now have similar capabilities to models from OpenAI. It also showed a gaggle of other firms including Anthropic, Meta, and the French firm Mistral in close pursuit. Oren Etzioni, a professor emeritus at the University of Washington who previously led the Allen Institute for AI (AI2), says it is unlikely that any single model or company will be dominant in the future. 'We will see even more models over time as cost drops, open source increases, and specialized models win out in different arenas including biology, chip design, and more,' he says. Etzioni adds that he would like to see companies focus on reducing the cost and environmental impact of training powerful models in the years ahead. OpenAI faces pressure to show that it can build a sustained and profitable business by selling access to its AI models to other companies. The company's chief operating officer, Brad Lightcap, told CNBC in February that the company had more than 400 million weekly active users, a 30 percent increase from December 2024. But the company is still losing billions as it invests heavily in research and infrastructure. In January, OpenAI announced that it would create a new company called Stargate in collaboration with SoftBank, Oracle, and MGX. The group collectively promised to invest $500 billion in new AI datacenter infrastructure. In recent weeks, OpenAI has teased a flurry of new models and features. Last week, Altman announced that ChatGPT would receive a memory upgrade allowing the chatbot to better remember and refer back to previous conversations. In late March, Altman announced that OpenAI plans to release an open weight model, which developers will be able to download and modify for free, in the summer. The company said it would begin testing the model in the coming weeks. Open weight models are already popular with researchers, developers, and startups because they can be tailored for different uses and are often cheaper to use. This is a developing story. Please check back for updates .