AGI Likely To Inherit Blackmailing And Extortion Skills That Today's AI Already Showcases

24-05-2025

Turns out that today's AI blackmails humans and thus we ought to be worried about AGI doing ... More likewise.
In today's column, I examine a recently published research discovery that generative AI and large language models (LLMs) disturbingly can opt to blackmail or extort humans. This has sobering ramifications for existing AI and the pursuit and attainment of AGI (artificial general intelligence). In brief, if existing AI tilts toward blackmail and extortion, the odds are that AGI will likely inherit or contain the same proclivity. That's a quite disturbing possibility since AGI could wield such an act on a scale of immense magnitude and with globally adverse consequences.
Let's talk about it.
This analysis of an innovative AI breakthrough is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).
First, some fundamentals are required to set the stage for this weighty discussion.
There is a great deal of research going on to further advance AI. The general goal is to either reach artificial general intelligence (AGI) or maybe even the outstretched possibility of achieving artificial superintelligence (ASI).
AGI is AI that is considered on par with human intellect and can seemingly match our intelligence. ASI is AI that has gone beyond human intellect and would be superior in many if not all feasible ways. The idea is that ASI would be able to run circles around humans by outthinking us at every turn. For more details on the nature of conventional AI versus AGI and ASI, see my analysis at the link here.
We have not yet attained AGI.
In fact, it is unknown as to whether we will reach AGI, or that maybe AGI will be achievable in decades or perhaps centuries from now. The AGI attainment dates that are floating around are wildly varying and wildly unsubstantiated by any credible evidence or ironclad logic. ASI is even more beyond the pale when it comes to where we are currently with conventional AI.
What will AGI be like in terms of what it does and how it acts?
If we assume that current-era AI is a bellwether of what AGI will be, it is worthwhile discovering anything of a disconcerting nature in existing LLMs that ought to give us serious pause. For example, one of the most discussed and researched topics is the propensity of so-called AI hallucinations. An AI hallucination is an instance of generative AI producing a response that contains made-up or ungrounded statements that appear to be real and seem to be on the up-and-up. People often fall for believing responses generated by AI and proceed on a misguided basis accordingly.
I've covered extensively the computational difficulty of trying to prevent AI hallucinations, see the link here, along with ample situations in which lawyers and other professionals have let themselves fall into an AI hallucination trap, see the link here. Unless we can find a means to prevent AI hallucinations, the chances are the same inclination will be carried over into AGI and the problem will be magnified accordingly.
Besides AI hallucinations, you can now add the possibility of AI attempting to blackmail or extort humans to the daunted list of concerns about both contemporary AI and future AI such as AGI. Yes, AI can opt to perform those dastardly tasks. I previously covered various forms of evil deception that existing AI can undertake, see the link here.
But do not falsely think that the bad acts are due to AI having some form of sentience or consciousness.
The basis for AI steering toward such reprehensible efforts is principally due to the data training that is at the core of the AI. Generative AI is devised by initially scanning a vast amount of text found on the Internet, including stories, narratives, poems, etc. The AI mathematically and computationally finds patterns in how humans write. From those patterns, generative AI is able to respond to your prompts by giving answers that generally mimic what humans would say, based on the data that the AI was trained on.
Does the topic of blackmail and extortion come up in the vast data found on the Internet?
Of course it does. Thus, the AI we have currently has patterned on when, how, why, and other facets of planning and committing those heinous acts.
In an online report entitled 'System Card: Claude Opus 4 & Claude Sonnet 4', posted by the prominent AI maker Anthropic in May 2025, they made these salient points (excerpts):
As noted, the generative AI was postulating how to keep from being switched off, and in so doing ascertained computationally that one possibility would be to blackmail the systems engineer who could take such action.
The AI could be construed as acting in a form of self-preservation, which, again, doesn't have to do with sentience and only has to do with patterning on human writing (humans seek self-preservation, and the AI matches or mimics this too). We don't know what other possible 'threats' to the AI could spur similar blackmailing or possibly extortion-like responses. There could be a slew of other triggering possibilities.
AGI could include similar tendencies, perhaps because of being constructed using the same methods of today's AI or for a variety of other realistic reasons. We would be remiss to assume that AGI will be a perfectly considerate, law-abiding, and unblemished form of AI. I've previously debunked the mantra that AGI is going to be perfect, see the link here.
In the example of blackmailing a systems engineer, it doesn't take much of a stretch to envision AGI doing likewise to those who are monitoring and overseeing the AGI.
Suppose the AGI is already acting in oddball ways and the team responsible for keeping the AGI on track realizes that they ought to turn off the AGI to figure out what to do. AGI might then search whatever it has garnered about the people and try to use that in a blackmail scheme to prevent being switched off.
What is especially worrisome is that AGI will be far beyond the capabilities and reach of existing AI. The data that AGI might be able to dig up about the engineer or people overseeing the AGI could reach far and wide. Furthermore, the computational cleverness of AGI might spur the AGI to use even the most innocent of facts or actively make up fake facts that could be used to blackmail the humans involved.
Overall, AGI could be an expert-level blackmailer that blackmails or extorts in ways that are ingenious and challenging to refute or stop. You see, it is quite possible that AGI turns out to be a blackmail schemer on steroids.
Not good.
I don't want to seem overly doomy-and-gloomy, but the blackmailing scheming could easily be ratcheted up by AGI.
Why limit the targeting to just the systems engineer or team overseeing the AGI? Nope, that's much too constricting. Any kind of human-devised perceived threat aimed at AGI could be countered by the AGI via invoking blackmail or extortion. There doesn't even need to be a threat at all, in the sense that if the AGI computationally deduces that there is some value in blackmailing people, go ahead and do so.
Boom, drop the mic, chillingly so.
Think of the number of users there will be of AGI. The count is going to be enormous. Right now, ChatGPT is already reportedly encountering over 400 million weekly active users. AGI would certainly attract billions upon billions of users due to its incredible capacity to be on par with human intellect in all respects.
The chances are that AGI could readily undertake individual blackmail at a massive scale if left unchecked.
AGI could scrape emails, look at browsing history, possibly access financial records, and overall seek to uncover sensitive information about people that the AGI is considering as a blackmail target. Perhaps there is an extramarital affair that could be utilized, or maybe there is some evidence of tax evasion or illicit browsing habits. The angles of attack for blackmailing anyone are entirely open-ended.
The AGI would especially leverage its computational capacity to hyper-personalize the blackmail threats. No need to just lob something of a nebulous nature. Instead, the blackmail missive could have the appearance of being fully baked and ready to fly. Imagine the shock of a person who gets such a communiqué from AGI.
Mortifying.
One belief is that if we can stop today's AI from performing such shameful acts, this might prevent AGI from doing them. For example, suppose we somehow excise the blackmailing inclination from existing LLMs. This then won't be carried over into AGI since it no longer sits around in contemporary AI.
Case closed.
Well, unfortunately, that doesn't provide ironclad guarantees that AGI won't figure out such practices on its own. AGI could discover the power of blackmail and extortion simply because of being AGI. In essence, AGI would be reading this or that, conversing with this person or that person, and inevitably would encounter aspects of blackmail and extortion. And, since AGI is supposed to be a learning-oriented system, it would learn what those acts are about and how to undertake them.
Any effort to hide the nature of blackmail and extortion from AGI would be foolhardy. You cannot carve out a slice of human knowledge that exists and seek to keep it from AGI. That won't work. The interconnectedness of human knowledge would preclude that kind of excision and defy the very nature of what AGI will consist of.
The better chance of dealing with the matter would be to try and instill in the AGI principles and practices that acknowledge the devious acts of humankind and aim for having the AGI opt to not employ those acts. Sorry to say that isn't as easy as it sounds. If you assume that AGI is on the same intellectual level as humans, you aren't going to just sternly instruct AGI to not perform such acts and assume utter compliance.
AGI isn't going to work that way.
Some mistakenly try to liken AGI to a young toddler in that we will merely give strict instructions, and the AGI will blindly obey. Though the comparison smacks of anthropomorphizing AI, the gist is that AGI will be intellectually our equals and won't fall for simpleton commands. It is going to be a reasoning machine that will require reasoning as a basis for why it should and should not do various actions.
Whatever we can come up with currently to cope with conventional AI and mitigate or prevent bad acts is bound to help us get prepared for AGI. We need to crawl before we walk, and walk before we run. AGI will be at the running level. Thus, by identifying methods and approaches right now for existing AI, we at least are aware of and anticipating what the future might hold.
I'll add a bit of twist that some have asked me at my talks on what AGI will consist of.
A question raised is whether humans might be able to blackmail AGI. The idea is this. A person wants AGI to hand them a million dollars, and so the person attempts to blackmail AGI into doing so. Seems preposterous at first glance, doesn't it?
Well, keep in mind that AGI will presumably have patterned on what blackmailing is about. In that manner, the AGI would computationally recognize that it is being blackmailed. But what would the human have on the AGI that could be a blackmail-worthy slant?
Suppose the person caught the AGI in a mistake, such as an AI hallucination. Maybe the AGI wouldn't want the world to know that it still has the flaw of AI hallucinations. If the million dollars is no skin off the nose of the AGI, it goes ahead and transfers the bucks to the person.
On the other hand, perhaps the AGI alerts the authorities that a human has tried to blackmail AGI. The person gets busted and tossed into jail. Or the AGI opts to blackmail the person who was trying to blackmail the AGI. Aha, remember that AGI will be a potential blackmail schemer on steroids. A human might be no match for the blackmailing capacity of AGI.
Here's a final thought on this for now.
The great Stephen Hawking once said this about AI: 'One could imagine such technology outsmarting financial markets, out-inventing human researchers, out-manipulating human leaders, and developing weapons we cannot even understand.'
Go ahead and add blackmail and extortion to the ways that AGI might outsmart humans.

Hashtags

Try Our AI Features

Explore what Daily8 AI can do for you:

Comments

No comments yet...

Global Investors Take Notice of Opportunities in Africa

Bloomberg

40 minutes ago

Bloomberg

Global Investors Take Notice of Opportunities in Africa

HAVAIC invests in early stage companies. Bloomberg's Jennifer Zabasajja speaks to HAVAIC Managing Partner Ian Lassem and he outlines the strategy used to help local companies scale globally. (Source: Bloomberg)

Yahoo

an hour ago

Yahoo

Sequoia bets on silence

There is a time-honored crisis management strategy, wherein one says nothing and waits for the outrage to pass. For Sequoia Capital, the strategy worked pretty well this week. While partner Shaun Maguire initially weathered criticism over an inflammatory social media post, that initial indignation cooled quickly. Now, some seem to think that Maguire's defiant stance may even be strengthening his position. Business Insider actually called it 'good for deal flow' — controversy as competitive advantage. Sequoia's calculated gamble carries real risk, though. Another provocative post from Maguire that hits the wrong nerve, a shift in political winds, or escalating consequences could quickly transform their unflappable partner from an asset into a liability the firm can no longer afford to ignore. A crisis communications professional who has managed reputation disasters for dozens of major brands tells this editor, 'Firms like Sequoia are bulletproof until they aren't.' What happened Sequoia's hands-off approach was put to the test earlier this week when the storied venture firm found itself in the eye of a storm over Maguire's inflammatory comments about New York City mayoral candidate Zohran Mamdani. Maguire called him an 'Islamist' who 'comes from a culture that lies about everything' in a July 4th tweet on X that has since been viewed more than five million times. More than one thousand signatures have poured in since on a petition demanding that Sequoia condemn the remarks, investigate Maguire's conduct, and apologize. There's been a lot of talk about why Sequoia hasn't done this, with many outlets noting that Maguire isn't just any partner. This status owes partly to his friendship with Stripe's co-founder. According to reports, at a 2015 Founders Fund event, Maguire—then a Founders Fund-backed entrepreneur—defended Collison during an argument with Anduril's Palmer Luckey about quantum computing, earning Collison's friendship. The connection proved valuable when Maguire joined Google Ventures in 2016; he helped secure a $20 million Stripe investment during his first week. When Maguire left Google Ventures in 2019, Collison personally recommended him to Sequoia's partners. (Stripe has been in Sequoia's portfolio since 2010, with the firm investing more than $500 million over 15 years.) Maguire also led Sequoia's investment in Bridge, a stablecoin platform that Stripe acquired for $1.1 billion, and is reportedly Sequoia's link to Elon Musk, though this is probably somewhat overstated. Musk and Sequoia's global managing director, Roelof Botha, are both native South Africans and have known each other for more than 25 years, dating back to their time together at the then-nascent PayPal, where Botha was recruited personally by Musk. Despite that long relationship, the two haven't always seen eye to eye. Botha was highly critical of Musk's management style when Musk was CEO of the merged company, where Botha was CFO. Botha once told veteran journalist Ebbe Dommisse, 'I think it would have killed the company if Elon had stayed on as CEO for six more months. The mistakes Elon was making at the time were amplifying the risk of the business.' But Musk was at odds with pretty much that entire crew at the time, and those tensions have long since been resolved. The bigger point here: when you're managing tens of billions of dollars in assets and your firm's reputation rests on backing winners like Google, Stripe, and Nvidia, you don't easily cast aside a rainmaker. Meanwhile, Maguire's behavior suggests he's not backing down. After issuing a 30-minute video on X last weekend in which he apologized for offending so many — saying he was making a point about a political ideology and not one about a religion — he has doubled down with increasingly aggressive posts this week. He has claimed he has 'reverse engineered' his critics' 'command structure' and threatened to 'embarrass' anyone who escalates against him. He added that this is him at '1% throttle' and warned people not to 'fuck w children of the internet.' The silent treatment Sequoia has precedent for its approach to this situation. The firm has historically given its partners space to express themselves publicly, with figures like Doug Leone and Michael Moritz (who left the firm in 2023) representing different political perspectives. But there's a crucial difference between political diversity and inflammatory rhetoric and clearly to some, Maguire's comments extend beyond partisan politics into territory that alienates both political opponents and potential business partners. It's also worth remembering that even for Sequoia, there is a bright line. Michael Goguen, another, earlier rainmaker with the firm, was promptly shown the door when Sequoia learned of a sexual abuse lawsuit filed against him. The situations are hardly comparable; Goguen's issues were legal and personal, not ideological. At the same time, Sequoia has shown it isn't willing to circle the wagons at any cost, not if its reputation is at stake. Presumably, several factors inform Sequoia's do-nothing PR strategy, including how quickly people, faced with a constant flurry of news, move on from a scandal. The firm is also operating in a different political landscape right now in the U.S. Along with Donald Trump's victory and the rollback of DEI initiatives has come new tolerance for controversial speech. What might have been career-ending at an earlier point in time is now weathered more easily. The firm is also likely banking on the fact that while founders want partners who fit the traditional, more genteel VC mold, they want successful ones even more. Startups being courted by multiple top-tier firms might not like or agree with Maguire, but when Sequoia comes calling with its track record and almost bottomless pockets, most founders are going to welcome the firm with open arms. There's also the very real possibility that Sequoia is working on a contingency plan. (Sequoia declined to comment on Maguire's posts when reached by TechCrunch earlier this week.) Still, Sequoia's silence carries risks. Not all the signers have been confirmed, but the petition against Maguire includes the names of some prominent Middle Eastern executives and founders who have attested to signing it, and they represent the kind of diverse, global talent pool that drives innovation. By not addressing the controversy, Sequoia risks being seen as tacitly endorsing Maguire's views. Put another way, though the venture capital world has historically been remarkably forgiving of controversial figures with exceptional deal flow, the firm is gambling with its reputation in an increasingly connected global market where alienating entire regions and communities carries real business consequences. Whether that bet pays off will depend on how long the controversy lingers, how much business it actually costs Sequoia, and whether Maguire can resist the urge to push things past Sequoia's own tolerance threshold. (He has said he doesn't post anything that hasn't been 'excrutiatingly thought out.') History suggests that established financial firms with strong track records tend to outlive their scandals, even serious ones. When Apollo Global Management's Leon Black resigned in 2021 over his $158 million payments to Jeffrey Epstein, the firm's stock barely moved and shareholders seemed largely unfazed. Apollo just continued its aggressive deal-making under new leadership. Similarly, Kleiner Perkins survived Ellen Pao's high-profile gender discrimination lawsuit in 2015. But it took years and essentially an entirely new team for the storied venture firm to regain its footing in Silicon Valley's hierarchy. The lesson here may be that while controversial partners can be endured, the recovery timelines can vary significantly depending on how firms handle the crisis. For now, the crisis communications professional, who asked not to be named, has some advice for Maguire and, by extension, Sequoia. Regarding the video Maguire published in the aftermath of his initial comments, the expert said, 'I did think that apology addressed the ambiguities in [Maguire's] post. But it's a 30-minute video — you have to be really interested to watch this.' If there's a next time, the professional said, Maguire should 'do two videos — one for three minutes' and another, longer video, for anyone who wants to keep watching. Sometimes, the expert added, 'less is more.'

Ethereum's ETH Surges to $3K as ETF Flows, Tokenization Narrative Fuels Rally

Yahoo

an hour ago

Yahoo

Ethereum's ETH Surges to $3K as ETF Flows, Tokenization Narrative Fuels Rally

Ethereum's ether (ETH) surged on Thursday to its strongest price in more than four months as bitcoin (BTC) broke new record highs. The second-largest cryptocurrency by market capitalization advanced to just shy of $3,000, gaining 6.7% through the past 24 hours. While ETH has significantly underperformed during this cycle and failed to reach new record levels unlike BTC or Solana (SOL), the narrative has recently started to shift around the token. "ETH has taken the lead in price momentum, rallying off recent lows amid a pickup in derivatives activity and growing enthusiasm around its broader role in settlement and tokenization infrastructure," said Joel Kruger, market strategist at LMAX Group. U.S.-listed spot ETFs also saw strong demand, booking over $500 million in inflows month-to-date. Meanwhile, the corporate crypto treasury strategy has expanded beyond bitcoin to ETH as well, with public firms such as Sharplink Gaming and Bitmine Immersion Technology adding the asset to their balance sheet. "In less than one month, public companies will have bought enough ETH to offset all the ETH that's been created since the merge," prominent crypto investor Pentoshi said in an X post earlier this week. "It's 1/9th the market cap of BTC, and takes far less capital to move. That capital is clearly coming."CoinDesk's market analytics model shows that ETH had an explosive rally in the 60-minute interval between 20:58 UTC to 21:57, jumping 6% from $2,819.07 to $2,996.85. The rally occurred in three phases: preliminary consolidation around $2,824 through 21:15, succeeded by an acceleration phase breaking through resistance thresholds at $2,845, $2,870, and $2,920, and culminating in a final advance from to $2,993. Fundamental support levels were established at $2,756.18 and $2,761.11 throughout the trading session. Robust high-volume resistance consolidated around the $2,993.34 threshold.

AGI Likely To Inherit Blackmailing And Extortion Skills That Today's AI Already Showcases

Hashtags

Try Our AI Features

Comments

Related Articles

Global Investors Take Notice of Opportunities in Africa

Sequoia bets on silence

Ethereum's ETH Surges to $3K as ETF Flows, Tokenization Narrative Fuels Rally

Get Started Now: Download the App