Encountered a problematic response from an AI model? More standards and tests are needed, say researchers

22-06-2025

As the usage of artificial intelligence — benign and adversarial — increases at breakneck speed, more cases of potentially harmful responses are being uncovered. These include hate speech, copyright infringements or sexual content.
The emergence of these undesirable behaviors is compounded by a lack of regulations and insufficient testing of AI models, researchers told CNBC.
Getting machine learning models to behave the way it was intended to do so is also a tall order, said Javier Rando, a researcher in AI.
"The answer, after almost 15 years of research, is, no, we don't know how to do this, and it doesn't look like we are getting better," Rando, who focuses on adversarial machine learning, told CNBC.
However, there are some ways to evaluate risks in AI, such as red teaming. The practice involves individuals testing and probing artificial intelligence systems to uncover and identify any potential harm — a modus operandi common in cybersecurity circles.
Shayne Longpre, a researcher in AI and policy and lead of the Data Provenance Initiative, noted that there are currently insufficient people working in red teams.
While AI startups are now using first-party evaluators or contracted second parties to test their models, opening the testing to third parties such as normal users, journalists, researchers, and ethical hackers would lead to a more robust evaluation, according to a paper published by Longpre and researchers.
"Some of the flaws in the systems that people were finding required lawyers, medical doctors to actually vet, actual scientists who are specialized subject matter experts to figure out if this was a flaw or not, because the common person probably couldn't or wouldn't have sufficient expertise," Longpre said.
Adopting standardized 'AI flaw' reports, incentives and ways to disseminate information on these 'flaws' in AI systems are some of the recommendations put forth in the paper.
With this practice having been successfully adopted in other sectors such as software security, "we need that in AI now," Longpre added.
Marrying this user-centred practice with governance, policy and other tools would ensure a better understanding of the risks posed by AI tools and users, said Rando.
Project Moonshot is one such approach, combining technical solutions with policy mechanisms. Launched by Singapore's Infocomm Media Development Authority, Project Moonshot is a large language model evaluation toolkit developed with industry players such as IBM and Boston-based DataRobot.
The toolkit integrates benchmarking, red teaming and testing baselines. There is also an evaluation mechanism which allows AI startups to ensure that their models can be trusted and do no harm to users, Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific, told CNBC.
Evaluation is a continuous process that should be done both prior to and following the deployment of models, said Kumar, who noted that the response to the toolkit has been mixed.
"A lot of startups took this as a platform because it was open source, and they started leveraging that. But I think, you know, we can do a lot more."
Moving forward, Project Moonshot aims to include customization for specific industry use cases and enable multilingual and multicultural red teaming.
Pierre Alquier, Professor of Statistics at the ESSEC Business School, Asia-Pacific, said that tech companies are currently rushing to release their latest AI models without proper evaluation.
"When a pharmaceutical company designs a new drug, they need months of tests and very serious proof that it is useful and not harmful before they get approved by the government," he noted, adding that a similar process is in place in the aviation sector.
AI models need to meet a strict set of conditions before they are approved, Alquier added. A shift away from broad AI tools to developing ones that are designed for more specific tasks would make it easier to anticipate and control their misuse, said Alquier.
"LLMs can do too many things, but they are not targeted at tasks that are specific enough," he said. As a result, "the number of possible misuses is too big for the developers to anticipate all of them."
Such broad models make defining what counts as safe and secure difficult, according to a research that Rando was involved in.
Tech companies should therefore avoid overclaiming that "their defenses are better than they are," said Rando.

Hashtags

#DataProvenanceInitiative

#InfocommMediaDevelopmentAuthority

#Longpre

#Rando

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

Intel Shares Plunge on Layoffs, Foundry Pullback

Epoch Times

9 minutes ago

Epoch Times

Intel Shares Plunge on Layoffs, Foundry Pullback

Announcing its second-quarter results on July 24, Intel Corp. said it is moving ahead with plans to cut about 15 percent of its staff by year's end and scale back its plans to build chip facilities overseas. Its shares plunged more than 9 percent during the July 25 trading session. The Santa Clara, California-based technology firm expects to maintain a core workforce of about 75,000—a reduction from 108,900 employees as of December 2024.

Sony looking to divest cellular chipset division

Yahoo

36 minutes ago

Yahoo

Sony looking to divest cellular chipset division

Sony Group is evaluating the potential sale of its cellular chipset business, Sony Semiconductor Israel, reported Reuters, citing three sources. This divestment is part of the company's strategy to focus on its entertainment operations. The Japanese conglomerate is collaborating with investment bankers to explore the sale, which is still in its preliminary phase, the sources indicated. The division, previously known as Altair Semiconductor, generates approximately $80m in annual recurring revenue and could be valued at around $300m in a potential transaction, they added. The business, which supplies cellular chipsets for connected devices such as wearables, smart meters, and home appliances, is expected to draw interest from both financial investors and companies within the semiconductor sector, the sources told Reuters. Sony declined to comment, and the sources spoke on condition of anonymity as the discussions are not public, the report noted. Sony acquired the Israel-based unit in 2016 for $212m. The company has increasingly prioritised its entertainment divisions, including games, movies, and music, which accounted for over 60% of its profit last year. As part of its portfolio restructuring, Sony is also preparing for a partial spinoff and direct listing of its financial services arm later this year. In April 2025, Bloomberg also reported that Sony is considering spinning off its semiconductor unit, potentially making Sony Semiconductor Solutions an independent entity this year. The spinoff would involve distributing most of Sony's chip business holdings to shareholders while retaining a minority stake. "Sony looking to divest cellular chipset division" was originally created and published by Verdict, a GlobalData owned brand. The information on this site has been included in good faith for general informational purposes only. It is not intended to amount to advice on which you should rely, and we give no representation, warranty or guarantee, whether express or implied as to its accuracy or completeness. You must obtain professional or specialist advice before taking, or refraining from, any action on the basis of the content on our site.

BONK Drops 9% From Peak as Exchange Transfers Overwhelm Burn News

Yahoo

42 minutes ago

Yahoo

BONK Drops 9% From Peak as Exchange Transfers Overwhelm Burn News

BONK experienced aggressive volatility over a 24-hour period to end the week, swinging 15% between $0.00003185 and $0.00003763. The Solana-based memecoin initially surged on the heels of a 500 billion token burn announcement before a Galaxy Digital-linked wallet moved $18.75 million worth of BONK to major crypto exchanges, triggering a 9% pullback in under an hour. Despite the reversal, BONK stabilized above the $0.00003400 level with support in the $0.00003185-$0.00003230 zone, according to CoinDesk Research's technical analysis data model. Trading activity surged as speculative flows responded to conflicting signals: bullish supply reduction versus the bearish pressure of potential sell-side liquidity injections. The broader crypto market's shift toward altcoins continued to boost meme token trading volumes as with bitcoin's market dominance weakening. BONK has benefited from increased institutional and retail attention, even as its price action remains volatile and sensitive to on-chain movements. Technical Analysis Highlights BONK moved between $0.00003185 and $0.00003763, a 15% swing. The price dropped 9% from a peak at 15:00 UTC to $0.00003430 by 16:00 UTC. Strong support held at $0.00003185-$0.00003230 amid multiple overnight retests. From 03:00 UTC, the price rallied to $0.00003438, showing short-term stabilization. Between 11:10 and 12:09 UTC, BONK fell 2% to $0.00003433. Reduced volume during final minutes suggests declining bearish momentum. Parts of this article were generated with the assistance from AI tools and reviewed by our editorial team to ensure accuracy and adherence to our standards. For more information, see CoinDesk's full AI Policy.