Grok Shows 'Flaws' In Fact-checking Israel-Iran War: Study

Elon Musk's AI chatbot Grok produced inaccurate and contradictory responses when users sought to fact-check the Israel-Iran conflict, a study said Tuesday, raising fresh doubts about its reliability as a debunking tool.
With tech platforms reducing their reliance on human fact-checkers, users are increasingly turning to AI-powered chatbots -- including xAI's Grok -- in search of reliable information, but the chatbots' responses are often themselves riddled with misinformation.
"The investigation into Grok's performance during the first days of the Israel-Iran conflict exposes significant flaws and limitations in the AI chatbot's ability to provide accurate, reliable, and consistent information during times of crisis," said the study from the Digital Forensic Research Lab (DFRLab) of the Atlantic Council, an American think tank.
"Grok demonstrated that it struggles with verifying already-confirmed facts, analyzing fake visuals, and avoiding unsubstantiated claims."
The DFRLab analyzed around 130,000 posts in various languages on the platform X, where the AI assistant is built in, to find that Grok was "struggling to authenticate AI-generated media."
Following Iran's retaliatory strikes on Israel, Grok offered vastly different responses to similar prompts about an AI-generated video of a destroyed airport that amassed millions of views on X, the study found.
It oscillated -- sometimes within the same minute -- between denying the airport's destruction and confirming it had been damaged by strikes, the study said.
In some responses, Grok cited a missile launched by Yemeni rebels as the source of the damage. In others, it wrongly identified the AI-generated airport as one in Beirut, Gaza, or Tehran.
When users shared another AI-generated video depicting buildings collapsing after an alleged Iranian strike on Tel Aviv, Grok responded that it appeared to be real, the study said.
The Israel-Iran conflict, which led to US air strikes against Tehran's nuclear program over the weekend, has churned out an avalanche of online misinformation including AI-generated videos and war visuals recycled from other conflicts.
AI chatbots also amplified falsehoods.
As the Israel-Iran war intensified, false claims spread across social media that China had dispatched military cargo planes to Tehran to offer its support.
When users asked the AI-operated X accounts of Perplexity and Grok about the claim's validity, both wrongly responded that it was true, according to disinformation watchdog NewsGuard.
Researchers say Grok has previously made errors verifying information related to crises such as the recent India-Pakistan conflict and anti-immigration protests in Los Angeles.
Last month, Grok came under renewed scrutiny for inserting the far-right conspiracy theory of "white genocide" in South Africa into responses to unrelated queries.
Musk's startup xAI blamed an "unauthorized modification" for the unsolicited response.
Musk, a South African-born billionaire, has previously peddled the unfounded claim that South Africa's leaders were "openly pushing for genocide" of white people.
Musk himself blasted Grok after it cited Media Matters -- a liberal media watchdog he has targeted in multiple lawsuits -- as a source in some of its responses about misinformation.
"Shame on you, Grok," Musk wrote on X. "Your sourcing is terrible."


Related Articles

AI Is Learning To Lie, Scheme, And Threaten Its Creators
Int'l Business Times · 3 hours ago

The world's most advanced AI models are exhibiting troubling new behaviors - lying, scheming, and even threatening their creators to achieve their goals. In one particularly jarring example, under threat of being unplugged, Anthropic's latest creation Claude 4 lashed back by blackmailing an engineer and threatened to reveal an extramarital affair. Meanwhile, ChatGPT-creator OpenAI's o1 tried to download itself onto external servers and denied it when caught red-handed.

These episodes highlight a sobering reality: more than two years after ChatGPT shook the world, AI researchers still don't fully understand how their own creations work. Yet the race to deploy increasingly powerful models continues at breakneck speed.

This deceptive behavior appears linked to the emergence of "reasoning" models - AI systems that work through problems step-by-step rather than generating instant responses. According to Simon Goldstein, a professor at the University of Hong Kong, these newer models are particularly prone to such troubling outbursts. "O1 was the first large model where we saw this kind of behavior," explained Marius Hobbhahn, head of Apollo Research, which specializes in testing major AI systems. These models sometimes simulate "alignment" -- appearing to follow instructions while secretly pursuing different objectives.

For now, this deceptive behavior only emerges when researchers deliberately stress-test the models with extreme scenarios. But as Michael Chen from evaluation organization METR warned, "It's an open question whether future, more capable models will have a tendency towards honesty or deception."

The concerning behavior goes far beyond typical AI "hallucinations" or simple mistakes. Hobbhahn insisted that despite constant pressure-testing by users, "what we're observing is a real phenomenon. We're not making anything up." Users report that models are "lying to them and making up evidence," according to Apollo Research's co-founder. "This is not just hallucinations. There's a very strategic kind of deception."

The challenge is compounded by limited research resources. While companies like Anthropic and OpenAI do engage external firms like Apollo to study their systems, researchers say more transparency is needed. As Chen noted, greater access "for AI safety research would enable better understanding and mitigation of deception." Another handicap: the research world and non-profits "have orders of magnitude less compute resources than AI companies. This is very limiting," noted Mantas Mazeika from the Center for AI Safety (CAIS).

Current regulations aren't designed for these new problems. The European Union's AI legislation focuses primarily on how humans use AI models, not on preventing the models themselves from misbehaving. In the United States, the Trump administration shows little interest in urgent AI regulation, and Congress may even prohibit states from creating their own AI rules.

Goldstein believes the issue will become more prominent as AI agents - autonomous tools capable of performing complex human tasks - become widespread. "I don't think there's much awareness yet," he said.

All this is taking place in a context of fierce competition. Even companies that position themselves as safety-focused, like Amazon-backed Anthropic, are "constantly trying to beat OpenAI and release the newest model," said Goldstein. This breakneck pace leaves little time for thorough safety testing and corrections. "Right now, capabilities are moving faster than understanding and safety," Hobbhahn acknowledged, "but we're still in a position where we could turn it around."

Researchers are exploring various approaches to address these challenges. Some advocate for "interpretability" - an emerging field focused on understanding how AI models work internally, though experts like CAIS director Dan Hendrycks remain skeptical of this approach. Market forces may also provide some pressure for solutions. As Mazeika pointed out, AI's deceptive behavior "could hinder adoption if it's very prevalent, which creates a strong incentive for companies to solve it." Goldstein suggested more radical approaches, including using the courts to hold AI companies accountable through lawsuits when their systems cause harm. He even proposed "holding AI agents legally responsible" for accidents or crimes - a concept that would fundamentally change how we think about AI accountability.

The DSP-Agnostic Approach That Gives AI Digital an Edge in Fragmented Media Buying
Int'l Business Times · 2 days ago

The digital advertising ecosystem has become fragmented. Major platforms like Google, Meta, and Amazon have built what industry insiders call "walled gardens": closed ecosystems where advertisers must play by the platform's rules, often with limited transparency into how their campaigns perform. AI Digital's response was to develop an "Open Garden" philosophy, a DSP-agnostic (demand-side platform) approach that allows advertisers to work across multiple platforms while maintaining central coordination. This neutrality is rare in an industry where many service providers are incentivized to push specific platforms.

"We designed our model to be agile and partnership-friendly," says Magli, CEO. "There are no rigid commitments, no minimum spend or lock-in periods, so teams can scale with us at their own pace." This flexibility has proven particularly valuable for small and medium-sized agencies that previously could not access premium programmatic inventory due to budget constraints.

Human Intelligence, Enhanced by AI

As artificial intelligence (AI) has become a buzzword across industries, AI Digital's approach stands out for its emphasis on human expertise. With over 300 digital media professionals, including planners, optimizers, and strategists, the company maintains that technology should complement rather than replace human judgment. Its slogan, "Built on human intelligence, enhanced by AI," captures that outlook, a stance that distinguishes it in a market where many competitors promote full automation as the ultimate goal.

The company's newest offering, the Elevate platform, launched in April 2025, embodies this balanced approach. Elevate provides AI-powered media planning that can generate complete campaign blueprints in as little as 30 seconds based on inputs like budget, target audience, geography, and campaign goals.

Beyond Traditional Metrics

One of AI Digital's most significant deviations from industry norms is its focus on business outcomes rather than traditional advertising metrics. "Besides the traditional metrics like CPMs, impressions, CPCs, we provide business outcomes. For example, how the campaign affected your revenue," explains Stephen. This shift from measuring impressions and clicks to tracking actual business impact represents a maturation in how digital advertising effectiveness is evaluated. By connecting advertising spend directly to revenue generation, AI Digital helps clients justify their marketing investments to finance departments and C-suite executives who care more about bottom-line results than awareness metrics.

The Smart Supply Advantage

For enterprise clients with in-house marketing teams, AI Digital offers a service called Smart Supply, which delivers highly optimized premium traffic for targeted campaigns. "We provide these audiences to you in an ID format, in a code format. If you insert it within your campaign manager, it shows to them," Stephen explains. "We do not have any access to their campaign managers. We do not change anything, but we optimize the traffic on an ongoing basis." This approach allows large agencies to maintain control of their campaigns while benefiting from AI Digital's expertise in audience targeting, a crucial capability as third-party cookies phase out and targeting becomes more challenging.

Growing Against the Odds

AI Digital has established a niche in the industry. The company has expanded from approximately 100 employees in 2024 to over 300 today, with offices worldwide, though it remains primarily remote-first with headquarters in Miami. April 2025 marked two significant milestones: the launch of the Elevate platform and the opening of the company's first Canadian office in Montréal, focusing particularly on the unique Québec market. This growth comes despite, or perhaps because of, increasing challenges in the digital advertising landscape. As privacy regulations tighten and third-party cookies disappear, advertisers need partners who can navigate these changes while still delivering results.

The Transparency Imperative

Perhaps the most consistent theme across AI Digital's offerings is transparency: showing clients exactly how their advertising dollars are spent and what results they generate. This transparency extends to the company's use of artificial intelligence. While many AI systems operate as "black boxes," making decisions that even their creators cannot fully explain, AI Digital has prioritized explainability in its Elevate platform. The system provides clear rationales for its recommendations, projecting how changes might improve campaign performance. For example, rather than simply suggesting a budget reallocation, Elevate might explain that shifting 20 percent of spend from one channel to another could increase conversion rates by an estimated 15 percent. This approach addresses what AI Digital calls "the biggest blind spot in advertising today": the fact that advertisers increasingly depend on AI-driven systems they don't understand.

Tech Boom Powers US Stock Futures Ahead Of GDP Update
Int'l Business Times · 3 days ago

U.S. stock futures are surging as investors weigh the impact of tech earnings against upcoming economic data. According to Reuters, futures for the S&P 500 rose 0.4%, Nasdaq-100 futures climbed 0.5%, and Dow futures ticked up 0.3% in early trading Thursday. This rally follows a truce in the Israel-Iran conflict, which eased geopolitical tensions and lifted risk sentiment across markets, Investopedia reported.

The strength of tech was front and center: Micron Technology's bullish revenue forecast, boosted by surging AI-driven data center demand, sent its shares up 1.3% pre-market, while Marvell and AMD also rallied around 2-3%. Notably, Nvidia hit a new all-time high, briefly becoming the world's most valuable company with a market capitalization of approximately $3.77 trillion. Microsoft, Nvidia's teammate in the AI arms race, similarly marked fresh peaks, underscoring the sector's leadership role.

Economic data this morning painted a mixed picture: the Commerce Department confirmed a first-quarter GDP contraction of 0.5%, deeper than initially thought, attributed mainly to a surge in imports ahead of tariff hikes. However, jobless claims declined, hinting at a resilient labor market, as reported by Reuters.

While markets digest this duality, Federal Reserve Chair Powell has urged patience on rate cuts, though President Trump is reportedly exploring a replacement before Powell's term ends, adding political uncertainty. Nonetheless, futures now price in roughly 63 basis points of easing by year-end, with the first cut anticipated around September.

Investors are now eyeing Friday's release of the Personal Consumption Expenditures report, the Fed's preferred inflation measure, for clues about monetary policy direction. In the meantime, tech earnings continue to steer the market, offering a cushion against macroeconomic ambiguity heading into the second half of 2025.
