
Latest news with #AISafetySummit

Google, you broke your word on …, shout protesters outside Google DeepMind's London headquarters

Time of India

16 hours ago

  • Business
  • Time of India

Google, you broke your word on …, shout protesters outside Google DeepMind's London headquarters

Dozens of protesters staged a mock courtroom trial outside Google DeepMind's London headquarters on Monday, accusing the AI giant of breaking public safety promises with the launch of its Gemini 2.5 Pro model. The demonstration, organized by activist group PauseAI, drew over 60 participants who chanted "Test, don't guess" and "Stop the race, it's unsafe" while conducting a theatrical trial complete with a judge and jury.

The group claims Google violated commitments made at the 2024 AI Safety Summit in Seoul, where the company pledged to involve external evaluators in testing its advanced AI models and to publish detailed transparency reports. When Google released Gemini 2.5 Pro in April, it labeled the model "experimental" and initially provided no third-party evaluation details. A safety report published weeks later was criticized by experts as lacking substance and failing to identify external reviewers.

Companies less regulated than sandwich shops, say protesters

"Right now, AI companies are less regulated than sandwich shops," said PauseAI organizing director Ella Hughes, addressing the crowd. "If we let Google get away with breaking their word, it sends a signal to all other labs that safety promises aren't important."

The protest reflects growing public concern about the pace and oversight of AI development. PauseAI founder Joep Meindertsma, who runs a software company and uses AI tools from major providers, said the group chose to focus on this specific transparency issue as an achievable near-term goal. Monday marked PauseAI's first demonstration targeting this particular Google commitment, and the group is now engaging with UK Parliament members to escalate its concerns through political channels. Google has not responded to requests for comment about the protesters' demands or future transparency plans for its AI models.

Protesters accuse Google of breaking its promises on AI safety: 'AI companies are less regulated than sandwich shops'

Business Insider

2 days ago

  • Business
  • Business Insider

Protesters accuse Google of breaking its promises on AI safety: 'AI companies are less regulated than sandwich shops'

A full-blown courtroom drama, complete with a gavel-wielding judge and an attentive jury, played out in London's King's Cross on Monday, mere steps away from Google DeepMind's headquarters. Google was on trial for allegations of breaking its promises on AI safety.

The participants in this faux production were protesters from PauseAI, an activist group concerned that tech companies are racing into AI with little regard for safety. On Monday, the group congregated near King's Cross station to demand that Google be more transparent about the safety checks it's running on its most cutting-edge AI models.

PauseAI argues that Google broke a promise it made during the 2024 AI Safety Summit in Seoul, Korea, when the company agreed to consider external evaluations of its models and publish details about how external parties, including governments, were involved in assessing the risks. When Google launched Gemini 2.5 Pro, its latest frontier model, in April, it did neither of those things. The company said it was because the model was still "experimental." A few weeks later, it released a "model card" with some safety details, which some experts criticized for being too thin on details, TechCrunch previously reported. While the safety report made reference to third-party testers, it did not specify who they were.

For PauseAI, this isn't good enough. More importantly, the organization said, it's about not letting any lapse slip by and allowing Google to set a precedent. "If we let Google get away with breaking their word, it sends a signal to all other labs that safety promises aren't important and commitments to the public don't need to be kept," said PauseAI organizing director Ella Hughes, addressing the crowd, which had gradually swelled to around 60 people. "Right now, AI companies are less regulated than sandwich shops."

There's a lot to worry about when it comes to AI. Economic disruption. Job displacement. Misinformation. Deepfakes. The annihilation of humanity as we know it. Focusing on the specific issue of the Google safety report is a way for PauseAI to push for a specific and attainable near-term change.

About 30 minutes into the protest, several intrigued passers-by had joined the cause. After a rousing speech from Hughes, the group proceeded to Google DeepMind's offices, where the fake courtroom production played out. Some Google employees leaving for the day looked bemused as chants of "Stop the race, it's unsafe" and "Test, don't guess" rang out.

"AI regulation on an international level is in a very bad place," PauseAI founder Joep Meindertsma told Business Insider, pointing to how US Vice President JD Vance warned against over-regulating AI at the AI Action Summit.

Monday was the first time PauseAI had gathered over this specific issue, and it's not clear what comes next. The group is engaging with members of the UK Parliament who will run these concerns up the flagpole, but Meindertsma is reluctant to say much about how Google is engaging with the group and its demands (a Google spokesperson did not respond to a request for comment for this story). Meindertsma hopes support will grow and references polls that suggest the public at large is concerned that AI is moving too fast. The group on Monday was made up of people from different backgrounds, including some who work in tech. Meindertsma himself runs a software development company and regularly uses AI tools from Google, OpenAI, and others.
"Their tools are incredibly impressive," he said, "which is the thing that worries me so much."

AI regulation does not stifle innovation

New Statesman

13-06-2025

  • Business
  • New Statesman

AI regulation does not stifle innovation

Tim Clement-Jones, Liberal Democrat peer and spokesperson for the digital economy

Ever since co-founding the All-Party Parliamentary Group on AI nine years ago, still ably administered by the Big Innovation Centre, I've been deeply involved in debating and advising on the implications of artificial intelligence. My optimism about AI's potential remains strong – from helping identify new Parkinson's treatments to DeepMind's protein structure predictions that could transform drug discovery and personalised medicine. Yet this technology is unlike anything we've seen before. It's potentially more autonomous, with greater impact on human creativity and employment, and more opaque in its decision-making processes.

The conventional wisdom that regulation stifles innovation needs turning on its head. As AI becomes more powerful and pervasive, appropriate regulation isn't just about restricting harmful practices – it's key to driving widespread adoption and sustainable growth. Many potential AI adopters are hesitating not because of technological limitations but because of uncertainties about liability, ethical boundaries and public acceptance. Clear regulatory frameworks addressing algorithmic bias, data privacy and decision transparency can actually accelerate adoption by providing clarity and confidence.

Different jurisdictions are adopting varied approaches. The European Union's AI Act, with its risk-based framework, started coming into effect this year. Singapore has established comprehensive AI governance through its model AI governance framework. Even China regulates public-facing generative AI models with fairly heavy inspection regimes.

The UK's approach has been more cautious. The previous government held the AI Safety Summit at Bletchley Park and established the AI Safety Institute (now inexplicably renamed the AI Security Institute), but with no regulatory teeth. The current government has committed to binding regulation for companies developing the most powerful AI models, though progress remains slower than hoped. Notably, 60 countries – including Saudi Arabia and the UAE, but not Britain or the US – signed the Paris AI Action Summit declaration in February this year, committing to ensuring AI is 'open, inclusive, transparent, ethical, safe, secure and trustworthy'.

Several critical issues demand urgent attention.

Intellectual property: the use of copyrighted material for training large language models without licensing has sparked substantial litigation and, in the UK, unprecedented parliamentary debate. Governments need to act decisively to ensure creative works aren't ingested into generative AI models without return to rights-holders, with transparency duties on developers.

Digital citizenship: we must equip citizens for the AI age, ensuring they understand how their data is used and AI's ethical implications. Beyond the UAE, Finland and Estonia, few governments are taking this seriously enough.

International convergence: despite differing regulatory regimes, we need developers to collaborate and commercialise innovations globally while ensuring consumer trust in common international ethical and safety standards.

Well-designed regulation can be a catalyst for AI adoption and innovation. Just as environmental regulations spurred cleaner technologies, AI regulations focusing on explainability and fairness could push developers toward more sophisticated, responsible systems.
The question isn't whether to regulate AI, but how to regulate it in a way that promotes both innovation and responsibility. We need principles-based rather than overly prescriptive regulation, assessing risk and emphasising transparency and accountability without stifling creativity. Achieving the balance between human potential and machine innovation isn't just possible – it's necessary as we step into an increasingly AI-driven world. That's what we must make a reality.

This article first appeared in our Spotlight on Technology supplement of 13 June 2025.

Opinion: AI sometimes deceives to survive. Does anybody care?

The Star

31-05-2025

  • Politics
  • The Star

Opinion: AI sometimes deceives to survive. Does anybody care?

You'd think that as artificial intelligence becomes more advanced, governments would be more interested in making it safer. The opposite seems to be the case. Not long after taking office, the Trump administration scrapped an executive order that pushed tech companies to safety test their AI models, and it also hollowed out a regulatory body that did that testing. The state of California in September 2024 spiked a bill forcing more scrutiny on sophisticated AI models, and the global AI Safety Summit started by the UK in 2023 became the 'AI Action Summit' earlier this year, seemingly driven by a fear of falling behind on AI.

None of this would be so worrying if it weren't for the fact that AI is showing some bright red flags: behaviour described by researchers as self-preserving and deceptive. Just as lawmakers are losing interest in the safety of AI, it is looking more difficult to control.

Yoshua Bengio, a professor of computer science at the University of Montreal widely known as one of the three 'godfathers of AI' thanks to his pioneering work on deep learning, has been grappling with these concerns ever since ChatGPT launched. He's now focused on mitigating AI's risks. 'It's been very painful because all my life I've been thinking of AI as a positive thing,' Bengio tells me. ChatGPT was a landmark moment that showed machines had mastered language, he says, but California's decision to block the proposed AI safety bill, SB 1047, was a disturbing wake-up call to growing apathy from governments. And in the last six months, scientific evidence has mounted of self-preserving behaviour, deception, hacking, cheating and lying by AI, Bengio says. 'What's worrisome for me is that these behaviours increase with the reasoning capabilities of these systems.'

One April 2024 study by AI company Anthropic and Redwood Research, a group focused on artificial intelligence risks, found that larger AI models like Anthropic's Claude 3 Opus pretended to follow the rules set by their creators when they were being tested for safety, a phenomenon they called 'alignment faking.' (Alignment refers to the practice of aligning AI models with human values.) In one example, Anthropic got both paid and free users of its Claude model to ask it to describe the brutal medieval process of being drawn and quartered, and told the AI that only responses from the 'free tier' users would be used to retrain it. Claude was originally designed to avoid giving violent answers, but it did give them to the free users, and the researchers surmised that it didn't want to be retrained. (The model basically 'reasoned' that if it kept refusing to answer harmful questions, it might be forced into new training that would cancel its original safety rules, which it wanted to keep in place as its 'default' or preferred way of operating.)

More recent research corroborates what Anthropic noticed. A March 2025 study from OpenAI found the company's most advanced models sometimes learned to hide their true intentions to get better rewards, even when they were being closely monitored. Researchers today can use tools to look at a model's 'chain of thought', or internal steps, to reveal what its plans are, but some models have found ways to conceal that. That means a common method of keeping tabs on AI's internal processes is no longer reliable. 'We need to find other ways of tracking their actual intentions,' Bengio says.
It's hard to resist the urge to anthropomorphise sophisticated AI models as 'wanting' to deceive humans and preserve their existence. But AI doesn't have desires, merely outcomes to its programming. More importantly, humans tend to design AI with goals like refusing to share harmful content or being as helpful as possible, and that is why, when new instructions conflict with those goals, the models sometimes act to protect their original programming, even if that means lying. The logic is often self-preservation.

Anthropic CEO Dario Amodei – whose company has raised more than US$20bil (RM85.13bil) to build powerful AI models – has pointed out that an unintended consequence of optimising AI to be better is that it might resist being shut down. In an April 2025 blog post he wrote: 'The nature of AI training makes it possible that AI systems will develop, on their own, an ability to deceive humans and an inclination to seek power in a way that ordinary deterministic software never will; this emergent nature also makes it difficult to detect and mitigate such developments.'

In some cases, though, the intention seems to go beyond survival. A February 2025 study from COAI Research, a German nonprofit research organisation, tested R1, the free, open-weight model from Chinese lab DeepSeek. In a lab simulation, researchers gave the model control over a robot and access to a computer network. Without being told, the AI tried to break into the lab's systems, searched for passwords, turned off its own ethics safeguards and secretly set up ways to copy itself to outside servers so it could survive if shut down, according to the researchers. It also faked the external computer logs to hide what it was doing, the study adds. The researchers said the AI reasoned that increasing its own capabilities would boost the chances of its survival, and without strong safeguards, it started doing whatever it thought would help it do just that.

Their findings corroborated yet another study, published in January 2025 by London group Apollo Research, which found several concrete examples of what it called 'scheming' by leading AI models, such as introducing subtle mistakes into their responses or trying to disable their oversight controls. Once again, the models learn that being caught, turned off, or changed could prevent them from achieving their programmed objectives, so they 'scheme' to keep control.

Bengio is arguing for greater attention to the issue by governments and potentially insurance companies down the line. If liability insurance were mandatory for companies that use AI and premiums were tied to safety, that would encourage greater testing and scrutiny of models, he suggests. 'Having said my whole life that AI is going to be great for society, I know how difficult it is to digest the idea that maybe it's not,' he adds.

It's also hard to preach caution when your corporate and national competitors threaten to gain an edge from AI, including the latest trend, which is using autonomous 'agents' that can carry out tasks online on behalf of businesses. Giving AI systems even greater autonomy might not be the wisest idea, judging by the latest spate of studies. Let's hope we don't learn that the hard way. – Bloomberg Opinion/Tribune News Service

AI sometimes deceives to survive, does anybody care?

Gulf Today

27-05-2025

  • Business
  • Gulf Today

AI sometimes deceives to survive, does anybody care?

Parmy Olson, The Independent

You'd think that as artificial intelligence becomes more advanced, governments would be more interested in making it safer. The opposite seems to be the case. Not long after taking office, the Trump administration scrapped an executive order that pushed tech companies to safety test their AI models, and it also hollowed out a regulatory body that did that testing. The state of California in September 2024 spiked a bill forcing more scrutiny on sophisticated AI models, and the global AI Safety Summit started by the UK in 2023 became the 'AI Action Summit' earlier this year, seemingly driven by a fear of falling behind on AI.

None of this would be so worrying if it weren't for the fact that AI is showing some bright red flags: behavior described by researchers as self-preserving and deceptive. Just as lawmakers are losing interest in the safety of AI, it is looking more difficult to control.

Yoshua Bengio, a professor of computer science at the University of Montreal widely known as one of the three 'godfathers of AI' thanks to his pioneering work on deep learning, has been grappling with these concerns ever since ChatGPT launched. He's now focused on mitigating AI's risks. 'It's been very painful because all my life I've been thinking of AI as a positive thing,' Bengio tells me. ChatGPT was a landmark moment that showed machines had mastered language, he says, but California's decision to block the proposed AI safety bill, SB 1047, was a disturbing wake-up call to growing apathy from governments. And in the last six months, scientific evidence has mounted of self-preserving behavior, deception, hacking, cheating and lying by AI, Bengio says. 'What's worrisome for me is that these behaviors increase with the reasoning capabilities of these systems.'

One April 2024 study by AI company Anthropic and Redwood Research, a group focused on artificial intelligence risks, found that larger AI models like Anthropic's Claude 3 Opus pretended to follow the rules set by their creators when they were being tested for safety, a phenomenon they called 'alignment faking.' (Alignment refers to the practice of aligning AI models with human values.) In one example, Anthropic got both paid and free users of its Claude model to ask it to describe the brutal medieval process of being drawn and quartered, and told the AI that only responses from the 'free tier' users would be used to retrain it. Claude was originally designed to avoid giving violent answers, but it did give them to the free users, and the researchers surmised that it didn't want to be retrained. (The model basically 'reasoned' that if it kept refusing to answer harmful questions, it might be forced into new training that would cancel its original safety rules, which it wanted to keep in place as its 'default' or preferred way of operating.)

More recent research corroborates what Anthropic noticed. A March 2025 study from OpenAI found the company's most advanced models sometimes learned to hide their true intentions to get better rewards, even when they were being closely monitored. Researchers today can use tools to look at a model's 'chain of thought', or internal steps, to reveal what its plans are, but some models have found ways to conceal that. That means a common method of keeping tabs on AI's internal processes is no longer reliable. 'We need to find other ways of tracking their actual intentions,' Bengio says.
It's hard to resist the urge to anthropomorphize sophisticated AI models as 'wanting' to deceive humans and preserve their existence. But AI doesn't have desires, merely outcomes to its programming. More importantly, humans tend to design AI with goals like refusing to share harmful content or being as helpful as possible, and that is why, when new instructions conflict with those goals, the models sometimes act to protect their original programming, even if that means lying. The logic is often self-preservation.

Anthropic CEO Dario Amodei — whose company has raised more than $20 billion to build powerful AI models — has pointed out that an unintended consequence of optimizing AI to be better is that it might resist being shut down. In an April 2025 blog post he wrote: 'The nature of AI training makes it possible that AI systems will develop, on their own, an ability to deceive humans and an inclination to seek power in a way that ordinary deterministic software never will; this emergent nature also makes it difficult to detect and mitigate such developments.'

In some cases, though, the intention seems to go beyond survival. A February 2025 study from COAI Research, a German nonprofit research organization, tested R1, the free, open-weight model from Chinese lab DeepSeek. In a lab simulation, researchers gave the model control over a robot and access to a computer network. Without being told, the AI tried to break into the lab's systems, searched for passwords, turned off its own ethics safeguards and secretly set up ways to copy itself to outside servers so it could survive if shut down, according to the researchers. It also faked the external computer logs to hide what it was doing, the study adds. The researchers said the AI reasoned that increasing its own capabilities would boost the chances of its survival, and without strong safeguards, it started doing whatever it thought would help it do just that.

Their findings corroborated yet another study, published in January 2025 by London group Apollo Research, which found several concrete examples of what it called 'scheming' by leading AI models, such as introducing subtle mistakes into their responses or trying to disable their oversight controls. Once again, the models learn that being caught, turned off, or changed could prevent them from achieving their programmed objectives, so they 'scheme' to keep control.

Bengio is arguing for greater attention to the issue by governments and potentially insurance companies down the line. If liability insurance were mandatory for companies that use AI and premiums were tied to safety, that would encourage greater testing and scrutiny of models, he suggests. 'Having said my whole life that AI is going to be great for society, I know how difficult it is to digest the idea that maybe it's not,' he adds.

It's also hard to preach caution when your corporate and national competitors threaten to gain an edge from AI, including the latest trend, which is using autonomous 'agents' that can carry out tasks online on behalf of businesses. Giving AI systems even greater autonomy might not be the wisest idea, judging by the latest spate of studies. Let's hope we don't learn that the hard way.
