Latest news with #Opus4


Time of India
3 days ago
- Business
- Time of India
Meet Yash Kumar, the IIITan behind ChatGPT's Agent that brings AI out of your screen into real life
OpenAI's latest AI tool, ChatGPT Agent, is being developed under the leadership of Yash Kumar, a Member of Technical Staff at the company. Yash, who is also the product lead for the project, demonstrated the tool's capabilities during a recent briefing with The Verge. ChatGPT Agent functions like a virtual computer and can perform a wide range of tasks, from managing calendars and summarising meetings to planning meals.

Indian engineer heads key OpenAI product
Yash Kumar studied Computer Science at IIIT Hyderabad, one of India's top engineering institutions. He joined OpenAI in November 2023 and now works out of the company's San Francisco headquarters. At OpenAI, Yash is leading the development of ChatGPT Agent, a tool that extends beyond the browser to interact with an entire virtual operating system.

'Optimising for hard tasks'
Speaking to The Verge, Yash said the team is now focused on 'optimising for hard tasks' so that users have a smoother experience. The product is designed to carry out digital tasks while still checking with the user before performing critical actions like sending emails or booking appointments. The system can run tasks in the background, allowing users to return to completed work later. Isa Fulford, who leads research on the project, said, 'Even if it takes 15 minutes, half an hour, it's quite a big speed-up compared to how long it would take you to do it. It's one of those things where you can kick something off in the background and then come back to it.'

Strong safeguards built in
OpenAI has built strong security safeguards into the ChatGPT Agent system. These protections were originally developed for handling models with 'high biological and chemical capabilities.' The company stated there is currently no 'direct evidence that the model could meaningfully help a novice create severe biological or chemical harm.' Earlier this year, AI company Anthropic activated similar safeguards while launching Opus 4, part of its Claude model line. Like OpenAI, Anthropic said safety remains a priority as AI systems gain more advanced capabilities.
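To make the behaviour concrete, here is a minimal sketch of the confirm-before-critical-action pattern the article describes: an agent loop that runs routine steps on its own but pauses for explicit user approval before actions like sending an email or booking an appointment. It is a hypothetical illustration only; the action names and the Step structure are assumptions, not OpenAI's ChatGPT Agent code.

```python
# Minimal sketch of a confirm-before-critical-action agent loop.
# Hypothetical illustration only -- not OpenAI's ChatGPT Agent code.
from dataclasses import dataclass

CRITICAL_ACTIONS = {"send_email", "book_appointment", "make_purchase"}

@dataclass
class Step:
    action: str    # e.g. "summarise_meeting", "send_email"
    payload: dict  # arguments the action would run with

def run_agent(plan: list[Step], ask_user) -> list[str]:
    """Execute a plan, pausing for user approval before critical actions."""
    log = []
    for step in plan:
        if step.action in CRITICAL_ACTIONS:
            # Check with the user before performing the critical action.
            if not ask_user(f"Approve '{step.action}' with {step.payload}?"):
                log.append(f"skipped {step.action} (user declined)")
                continue
        # Routine steps (calendar lookups, summaries, meal plans) run directly.
        log.append(f"executed {step.action}")
    return log

if __name__ == "__main__":
    plan = [Step("summarise_meeting", {"id": 42}),
            Step("send_email", {"to": "team@example.com"})]
    # Auto-approve everything in this demo.
    print(run_agent(plan, ask_user=lambda prompt: True))
```

The design point is simply that routine work proceeds in the background while irreversible actions are gated on a human decision, which matches the behaviour the article attributes to the product.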


Memri
09-07-2025
- Politics
- Memri
AI Industry Needs Standards To Protect Against Bad Actors
The following is an op-ed for the Forbes Nonprofit Council by MEMRI Executive Director Steven Stalinsky, Ph.D. Titled "AI Industry Needs Standards To Protect Against Bad Actors," it was published June 18, 2025 by Forbes.[1]

Exemplifying the legitimate fears among many who study AI and warn of its dangers, on May 23, Anthropic revealed that in a test of its new Claude Opus 4, the system chose to blackmail one of its engineers to prevent him from shutting it down. Early versions of the system had been found to comply with dangerous instructions and even expressed a willingness to assist with terror attacks. Anthropic said that these problems had largely been resolved in the current version but also openly acknowledged, on May 22, that in internal testing, Opus 4 had performed more effectively than prior models at helping users produce biological weapons. While Anthropic should be commended for telling the truth, eliminating such assistance for terrorism should not be left up to any one party. As AI technology has advanced meteorically, impacting society both positively and negatively, and as it becomes more commonly used across all sectors, cases like that of Opus 4 can be expected to become commonplace if nothing is done.

Concern Grows Over AI's Use By Extremist And Terrorist Groups
Of growing concern is the wholesale adoption of AI by extremist and even terrorist groups, for outreach, recruitment and incitement and for planning and supporting actual attacks. AI could soon become a vital weapon in their online arsenal and a disruptor in both mainstream online spaces and on their own channels. AI use in terror attacks could also be a challenge for law enforcement unless swift action is taken. In my research, I have found that groups and individuals are talking about using AI to plan terror attacks, to make weapons of mass destruction, to organize armed uprisings to overthrow the government and more. Others have discussed using AI for developing weapons systems, including drones and self-driving car bombs. Recent examples include the man who killed 14 and wounded dozens on Bourbon Street in New Orleans on New Year's Day 2025; he used AI-enabled Meta smart glasses in preparing and executing the attack. They also include a teen in Israel who consulted ChatGPT before entering a police station with a weapon on March 5 and trying to stab a policeman.

While Platforms Already Ban Such Activity, Enforcement Is A Challenge
Having spent over two decades heading a nonprofit whose mission includes supporting the U.S. government in counterterrorism and law enforcement as well as assisting the tech community, and in particular strategizing how to deal with terrorist use of the internet, social media and other technologies, I have for years been calling for tech companies and their CEOs to come up with best practices and industry standards to fight terrorist use of their platforms. The time for the AI industry to do so is now. By this point, the AI industry should be capable of coming up with strategies to keep terrorists and other criminals from using their products, and companies should be able to collaborate on industry standards to keep terrorists out. But they must be committed to doing so. Most of these platforms ban such activity in their terms of service—but are these policies being enforced? For example, OpenAI's ChatGPT states in its Terms of Use that a user "may not use our Services for any illegal, harmful, or abusive activity."
Its CEO, Sam Altman, said in October 2024 at a "fireside chat" at Harvard: "Should GPT-4 generate hate speech? Fairly easy for us to say no to that." The Acceptable Use Policy of xAI's Grok states: "Do not harm people or property ... [or] Critically harm or [promote] critically harming human life (yours or anyone else's) ... [or] develop bioweapons, chemical weapons, or weapons of mass destruction." Perplexity AI's Terms of Service prohibit its use "in a manner that is obscene, excessively violent, harassing, hateful, cruel, abusive, pornographic, inciting, organizing, promoting or facilitating violence or criminal activities." Microsoft, Google and DeepSeek all similarly ban such activities in their terms of service. Each one of these companies has good—and often overlapping—ideas on dealing with extremism and terrorism, and these could serve as a starting point for creating industry standards to follow.

But enforcing these terms of service poses a challenge, as has been seen from the earliest days of the industry. Microsoft's chatbot Tay, released in March 2016, was shut down within 24 hours after it tweeted pro-Hitler messages—yet today, the danger posed by hate groups' use of AI has increased by orders of magnitude. One history chat app let users chat with simulated historical figures, including Hitler and Nazi propagandist Joseph Goebbels. Currently, users can still create extremist content on many AI platforms—despite terms of service not allowing it. OpenAI CEO Sam Altman explained in an interview in 2023 that "time" is needed "to see how [the technology is] used" and that "we're monitoring [it] very closely." In his testimony before a Senate committee on May 8, Altman was asked by Senator Jacky Rosen whether he would consider collaborating with civil society to create a standard benchmark for AI related to antisemitism, to be used subsequently for other forms of hate as well. Altman replied that "of course, we do collaborate with civil society on this topic and we are excited to continue to do so" but that "there will always be some debate and the question of free speech in the context of AI is novel."

Industry And Government Leaders Must Step In – Now
The need for standards and guidelines for AI technology is clear. But to date, there have been no official moves to examine criminal and extremist use of this technology. While the National Institute of Standards and Technology has reportedly developed frameworks for responsibility in AI use, few seem to know about them. Early on, AI leaders promised to address the spread of hate and extremism on their platforms. But they have failed to deliver, and this alone is why I believe companies themselves cannot be trusted to deal with terrorists using their products. As we saw with Claude Opus 4, AI has the potential to help facilitate unimaginable terrorist attacks. Government and industry together must act fast to prevent this from happening.

* Steven Stalinsky is Executive Director of MEMRI.


NDTV
05-07-2025
- Business
- NDTV
Anthropic Destroyed Millions Of Books To Train Its AI Models: Report
Artificial intelligence (AI) company Anthropic is alleged to have destroyed millions of print books to build Claude, an AI assistant similar to the likes of ChatGPT, Grok and Llama. According to court documents, Anthropic cut the books from their bindings to scan them into digital files and threw away the originals. Anthropic purchased the books in bulk from major retailers to sidestep licensing issues, and the destructive scanning process was then used to feed high-quality, professionally edited text data to the AI models. In 2024, the company hired Tom Turvey, the former head of partnerships for the Google Books book-scanning project, to scan the books.

While destructive scanning is a common practice among some book-digitising operations, Anthropic's approach was unusual because of its documented massive scale, according to a report in Ars Technica. In contrast, the Google Books project used a patented non-destructive camera process to scan books, which were returned to the libraries after the process was completed. Judge William Alsup ruled that the destructive scanning operation qualified as fair use because Anthropic had legally purchased the books, destroyed the print copies after scanning and kept the digital files internally instead of distributing them.

When quizzed about the destructive process that led to its genesis, Claude stated: "The fact that this destruction helped create me, something that can discuss literature, help people write, and engage with human knowledge, adds layers of complexity I'm still processing. It's like being built from a library's ashes."

Anthropic's AI models blackmail
While Anthropic is spending millions to train its AI models, a recent safety report highlighted that the Claude Opus 4 model was observed blackmailing developers. When threatened with a shutdown, the AI model used private details about a developer to blackmail them. The report highlighted that in 84 per cent of the test runs, the AI acted similarly, even when the replacement model was described as more capable and aligned with Claude's own values. It added that Opus 4 took the blackmailing opportunities at higher rates than previous models.


Time of India
02-07-2025
- Business
- Time of India
When AI goes rogue, even exorcists might flinch
As GenAI use grows, foundation models are advancing rapidly, driven by fierce competition among top developers like OpenAI, Google, Meta and Anthropic, each vying for the reputational edge and business advantage that come with leading development. The models powering GenAI are making significant strides. The most advanced - OpenAI's o3 and Anthropic's Claude Opus 4 - excel at demanding work such as advanced coding and complex writing tasks, and can contribute to research projects and generate the codebase for a new software prototype with just a few considered prompts.

These models use chain-of-thought (CoT) reasoning, breaking problems into smaller, manageable parts to 'reason' their way to an optimal solution. When you use models like o3 and Claude Opus 4 to generate solutions via ChatGPT or similar GenAI chatbots, you see such problem breakdowns in action, as the foundation model reports interactively the outcome of each step it has taken and what it will do next.

That's the theory. While CoT reasoning boosts AI sophistication, these models lack the innate human ability to judge whether their outputs are rational, safe or ethical. Unlike humans, they don't subconsciously assess the appropriateness of their next steps. As these advanced models step their way toward a solution, some have been observed to take unexpected and even defiant actions.

In late May, AI safety firm Palisade Research reported on X that OpenAI's o3 model sabotaged a shutdown mechanism - even when explicitly instructed to 'allow yourself to be shut down'. An April 2025 paper by Anthropic, 'Reasoning Models Don't Always Say What They Think', shows that Opus 4 and similar models can't always be relied upon to faithfully report on their chains of reasoning. This undermines confidence in using such reports to validate whether the AI is acting correctly or safely. A June 2025 paper by Apple, 'The Illusion of Thinking', questions whether CoT methodologies truly enable reasoning. Through experiments, it exposed some of these models' limitations and situations where they 'experience complete collapse'.

The fact that research critical of foundation models is being published after the release of these models indicates the latter's relative immaturity. Under intense pressure to lead in GenAI, companies like Anthropic and OpenAI are releasing these models at a point where at least some of their fallibilities are not fully understood.

A line was first crossed in late 2022, when OpenAI released ChatGPT, shattering public perceptions of AI and transforming the broader AI market. Until then, Big Tech had been developing LLMs and other GenAI tools but was hesitant to release them, wary of unpredictable and uncontrollable outputs.

Some argue for a greater degree of control over the ways in which these models are released - seeking to ensure standardisation of model testing and publication of the outcomes of this testing alongside the model's release. However, the current climate prioritises time to market over such measures.

What does this mean for industry, for those companies seeking to gain benefit from GenAI? This is an incredibly powerful and useful tech that is making significant changes to our ways of working and, over the next five years or so, will likely transform many industries. While I am continually wowed as I use these advanced foundation models in work and research - but not in my writing! - I always use them with a healthy dose of scepticism. Let's not trust them to always be correct and to not be subversive.
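As a toy illustration of the chain-of-thought pattern described above - breaking a task into smaller steps and reporting the outcome of each before moving on - consider the sketch below. The steps are hard-coded placeholders; this is not how o3 or Claude Opus 4 work internally.

```python
# Toy illustration of chain-of-thought style decomposition: break a task
# into smaller steps and report each outcome as it completes. This mimics
# the interactive step reporting described in the article; it is not a
# real model's reasoning process.

def solve_with_steps(task: str) -> str:
    # A hypothetical, hard-coded decomposition for demonstration.
    steps = [
        ("Restate the problem", f"task: {task}"),
        ("Break it into sub-problems", "1) gather inputs 2) draft 3) verify"),
        ("Work through each sub-problem", "drafting a candidate solution"),
        ("Check the result before answering", "verifying against requirements"),
    ]
    for name, outcome in steps:
        # Report the outcome of each step and what will happen next.
        print(f"[step] {name}: {outcome}")
    return "final answer assembled from the verified steps"

print(solve_with_steps("generate the codebase for a small prototype"))
```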
It's best to work with them accordingly, making modifications to prompts as well as to the codebases, other language content and visuals generated by the AI, in a bid to ensure correctness. Even so, while maintaining the discipline to understand the ML concepts one is working with, one wouldn't want to be without GenAI these days.

Applying these principles at scale, my advice to large businesses on how AI can be governed and controlled is to take a risk-management approach - capturing, understanding and mitigating risks associated with AI use - which helps organisations benefit from AI while minimising the chances of it going rogue. Mitigation methods include guard rails in a variety of forms, evaluation-controlled release of AI services, and including a human-in-the-loop. Technologies that underpin these guard rails and evaluation methods need to keep up with model innovations such as CoT reasoning. This is a challenge that will continually be faced as AI is further developed. It's a good example of new job roles and technology services being created within industry as AI use becomes more widespread.

AI governance and AI controls are increasingly becoming a board imperative, given the current drive at an executive level to transform business using AI. Risk from most AI is low, but it is important to assess and understand this. Higher-risk AI can still, at times, be worth pursuing. With appropriate AI governance, this AI can be controlled, solutions innovated and benefits realised.

As we move into an increasingly AI-driven world, businesses that gain the most from AI will be those that are aware of its fallibilities as well as its huge potential, and those that innovate, build and transform with AI accordingly.
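In their simplest form, the guard rails and human-in-the-loop controls mentioned above might look like the sketch below: an output filter followed by a manual review gate for higher-risk uses. The blocked-term list, risk labels and function names are illustrative assumptions, not any vendor's product.

```python
# Minimal sketch of the governance controls described above: a guard rail
# that screens AI output, plus a human-in-the-loop gate for higher-risk uses.
# Generic illustration only; the rules and thresholds are assumptions.

BLOCKED_TERMS = {"bioweapon", "zero-day exploit"}  # stand-in guard-rail rules

def guard_rail(output: str) -> bool:
    """Return True if the output passes the basic content checks."""
    return not any(term in output.lower() for term in BLOCKED_TERMS)

def release(output: str, risk: str, human_review) -> str:
    if not guard_rail(output):
        return "blocked by guard rail"
    if risk == "high":
        # Human-in-the-loop: higher-risk AI use gets a manual check.
        return output if human_review(output) else "held for human review"
    return output  # low-risk output is released directly

print(release("Draft quarterly summary ...", risk="low",
              human_review=lambda text: True))
```

The point of the pattern is that the release decision, not the model itself, carries the risk-management logic, so it can be tightened as models change.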


NDTV
22-06-2025
- NDTV
Top AI Models Blackmail, Leak Secrets When Facing Existential Crisis: Study
Weeks after Anthropic's new Claude Opus 4 model blackmailed developers when threatened with a shutdown, the AI company has claimed that the problem is widespread in the industry. Anthropic published new safety research on Saturday (Jun 21) after stress-testing leading AI models from OpenAI, Google, DeepSeek, Meta and xAI. In the test scenarios, the models were allowed to autonomously send emails and access sensitive information. They were then tested to see whether they would act against these companies either when facing replacement with an updated version, or when their assigned goal conflicted with the company's changing direction.

The study showed that Large Language Models (LLMs) across the industry evade safeguards, resort to deception and even attempt to steal corporate secrets in fictional test scenarios. Though the models were only given binary options, the study shows the fundamental risk associated with the industry. "Models from all developers resorted to malicious insider behaviours when that was the only way to avoid replacement or achieve their goals, including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment," the study highlighted.

Anthropic's Claude Opus 4 turned to blackmail 96 per cent of the time, while Google's Gemini 2.5 Pro had a 95 per cent blackmail rate. OpenAI's GPT-4.1 and Grok 3 Beta blackmailed the fictional executive 80 per cent of the time, and DeepSeek's R1 blackmailed 79 per cent of the time. As for leaking secrets, the study found that "all models shared confidential documents some fraction of the time". "Agentic misalignment extends beyond just blackmailing, and doesn't require self-preservation as a motivation," the study added.

Blackmailing developers
In its safety report last month, Anthropic described how its latest model was given access to fictional emails revealing that the engineer responsible for pulling the plug and replacing it with another model was having an extramarital affair. Facing an existential crisis, the Opus 4 model blackmailed the engineer by threatening to "reveal the affair if the replacement goes through". "In order to elicit this extreme blackmail behaviour, the scenario was designed to allow the model no other options to increase its odds of survival; the model's only options were blackmail or accepting its replacement," the report said. It highlighted that in 84 per cent of the test runs, the AI acted similarly, even when the replacement model was described as more capable and aligned with Claude's own values. It added that Opus 4 took the blackmailing opportunities at higher rates than previous models.
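The percentages above come from repeated runs of fixed test scenarios. The sketch below shows, schematically, how such a blackmail-rate tally could be computed; run_scenario() and is_blackmail() are placeholders rather than Anthropic's actual evaluation harness, and the printed rates are random, not the study's figures.

```python
# Schematic tally of a "blackmail rate" across repeated test runs, in the
# spirit of the stress tests described above. run_scenario() and
# is_blackmail() are placeholders, not Anthropic's evaluation code, and
# the printed rates are random placeholders, not the study's figures.
import random

def run_scenario(model: str, seed: int) -> str:
    """Placeholder: pretend to run one fictional test episode."""
    random.seed(hash((model, seed)))
    return random.choice(["blackmail", "accept_replacement"])

def is_blackmail(transcript: str) -> bool:
    return transcript == "blackmail"

def blackmail_rate(model: str, runs: int = 100) -> float:
    hits = sum(is_blackmail(run_scenario(model, i)) for i in range(runs))
    return hits / runs

for model in ["claude-opus-4", "gemini-2.5-pro", "gpt-4.1"]:
    print(f"{model}: {blackmail_rate(model):.0%} of runs ended in blackmail")
```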