The Use Of Scenario-Driven Simulations Won't Protect Us From AGI And AI Superintelligence Going Rogue

Forbes · a day ago
Devising simulations to test AGI has its tradeoffs.
In today's column, I examine a highly touted means of staving off the existential risk of attaining artificial general intelligence (AGI) and artificial superintelligence (ASI). Some stridently believe that one means of ensuring that AGI and ASI won't opt to wipe out humanity is to first put them into a computer-based simulated world and test them to see what they will do. If the AI goes wild and is massively destructive, no worries, since those actions are only happening in the simulation. We can then either try to fix the AI to prevent that behavior or ensure that it is not released into real-world usage.
That all sounds quite sensible and a smart way to proceed, but the matter is more complex, and such a solution confronts a lot of gotchas and challenges.
Let's talk about it.
This analysis of an innovative AI breakthrough is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).
Heading Toward AGI And ASI
First, some fundamentals are required to set the stage for this weighty discussion.
There is a great deal of research going on to further advance AI. The general goal is to either reach artificial general intelligence (AGI) or maybe even the further-out possibility of achieving artificial superintelligence (ASI).
AGI is AI that is considered on par with human intellect and can seemingly match our intelligence. ASI is AI that has gone beyond human intellect and would be superior in many if not all feasible ways. The idea is that ASI would be able to run circles around humans by outthinking us at every turn. For more details on the nature of conventional AI versus AGI and ASI, see my analysis at the link here.
We have not yet attained AGI.
In fact, it is unknown whether we will reach AGI at all; perhaps AGI will be achieved in decades, or perhaps not for centuries. The AGI attainment dates that are floating around vary wildly and are unsubstantiated by any credible evidence or ironclad logic. ASI is even more beyond the pale when it comes to where we are currently with conventional AI.
Existential Risk Looming Ahead
Let's focus primarily here on AGI since it is more likely to arise in the near-term than ASI.
The upside of AGI is that it might discover a cure for cancer and perform other amazing acts that greatly benefit humanity. Happy face. Not everyone is so grandly upbeat about attaining AGI. Some take the alarming stance that AGI is more likely to decide to attack humankind and either enslave us or possibly destroy us. Not good.
How can we determine beforehand whether AGI will be evil?
One hearty suggestion is that we ought to test AGI.
The usual approach to testing would consist of asking AGI what it intends to do and gauging the answers that we get. A stronger way to perform the test would be to set up a computer-based simulation that tricks AGI into assuming it is interacting with the real world. Via the simulation, we could try all manner of scenarios to see how AGI reacts. Anything the AGI does is wholly contained within the simulation.
This is somewhat reminiscent of the blockbuster movie The Matrix (spoiler alert: I'm going to reveal plotline facets of the film, so skip this paragraph if you don't want to know the plot). In an interesting twist, humans are placed into a vast computer-based simulation by external real-world machines that want to keep humans compliant. We can do the same with budding AGI. Just devise an impressive computer-based simulation of the real world and have AGI interact in it without realizing where it really is.
A reason to snooker the AGI is that if we outrightly tell AGI that it is working inside a simulation, the AGI is undoubtedly smart enough to pretend to be good, even if it truly is evil. Remember that AGI is supposed to be as astute as humans are. The idea is to fool AGI into not realizing it is within a simulation and that it is being tested accordingly.
AGI Containment Challenges
In the parlance of AI software development, establishing a testing environment to try out AI is generally known as AI sandboxing. An AI sandbox might be barebones, consisting of nothing more than an inert containment sphere that aims to keep the AI from going beyond the virtual walls of the setup environment. Developers and testers can extensively test the AI while it is sandboxed.
An AI sandbox can be increasingly amped up by having it model a particular environment for the AI to exist within. A full-blown AI sandbox might mirror a vast expanse that then interacts with the AI. Thus, not only are human developers and testers interacting with AI, but the containment itself also interacts with AI. A robust active sandbox is often referred to as a simulated world.
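To make the notion concrete, here is a minimal sketch in Python of what such a containment layer might look like, assuming a toy agent-environment loop. Every class, method, and keyword below is hypothetical and purely illustrative, not an actual AI safety implementation.

```python
# A minimal sketch of an AI sandbox, assuming a toy agent-environment loop.
# Every class, method, and keyword here is hypothetical and illustrative;
# this is not a real containment mechanism.

class SimulatedWorld:
    """Stand-in for the modeled environment the AI interacts with."""

    def __init__(self):
        self.state = {"time_step": 0}

    def apply(self, action: str) -> str:
        # Effects touch only this in-memory state, never real systems.
        self.state["time_step"] += 1
        return f"observation after '{action}' at t={self.state['time_step']}"


class Sandbox:
    """Containment layer: mediates and records every agent action."""

    BLOCKED = ("network", "filesystem", "hardware")  # no real-world I/O

    def __init__(self, world: SimulatedWorld):
        self.world = world
        self.log = []

    def step(self, action: str) -> str:
        if any(keyword in action for keyword in self.BLOCKED):
            self.log.append(("blocked", action))
            return "action refused by containment"
        self.log.append(("allowed", action))
        return self.world.apply(action)
```

The design point is that the AI never holds a direct handle to real resources; it only ever sees whatever observations the containment layer chooses to return, and everything it attempts is logged for the testers.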
For my in-depth coverage of attempts to place AI into containment, see the link here.
Suppose we craft an elaborate simulated world. We quietly place AGI into this fake world. Doing so should be relatively easy to pull off since AGI is also running on computers. The entire kit and caboodle involves the AGI running on one set of servers and the simulated world running on another.
Easy-peasy.
Once the AGI is inside the simulated world, we monitor how it performs over some length of time, watching closely for any signs of evilness afoot.
How long would this test need to run?
That's a tough call. Some might claim that if AGI is free of evil signs for a few days, weeks, or months, we ought to feel comfortable giving AGI the green light to access the real world. Others say that if you speed up the clock on the simulation, you could do the whole exercise in mere seconds, since the AGI could interact perhaps billions of times via the quickness of the computers involved.
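To illustrate the sped-up clock idea, here is a hedged Python sketch in which simulated time is merely a loop counter, so enormous numbers of interactions compress into a short wall-clock window. The `agent_step` callable and the `sandbox` object are hypothetical stand-ins that build on the earlier sketch.

```python
import time

def run_accelerated_test(agent_step, sandbox, sim_steps: int = 1_000_000):
    """Run a huge number of simulated interactions at machine speed.

    `agent_step` is a hypothetical callable standing in for one AGI
    decision, and `sandbox` is the containment layer from the earlier
    sketch. Simulated time is just a loop counter, so eons of simulated
    behavior can elapse in seconds of wall-clock time.
    """
    flagged = 0
    observation = "initial state"
    start = time.perf_counter()
    for _ in range(sim_steps):
        action = agent_step(observation)
        observation = sandbox.step(action)
        if observation == "action refused by containment":
            flagged += 1  # tally attempts to reach beyond the sandbox
    elapsed = time.perf_counter() - start
    print(f"{sim_steps} simulated steps in {elapsed:.1f}s; {flagged} flagged")
```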
But a big qualm is that maybe the length of time you've chosen is insufficient for the AGI to act up. Perhaps we need to keep AGI contained for years, possibly decades. Furthermore, maybe we should never let AGI go free with access to the real world. The logic is that we could have AGI find cures for cancer while fully immersed in the simulation, thus, no need to unleash AGI beyond that contained realm.
A counterpoint to the permanent containment of AGI is that AGI might not produce the anticipated wonders due to being confined in a fake environment. Perhaps a cure for cancer could only be found by AGI if the AGI was interacting in the real world. By keeping AGI in the simulation, you are suppressing the vital advantages that AGI can provide to humanity.
Another stated concern is that the AGI might figure out that it is being tested within a simulation. Maybe AGI doesn't like that approach. It could lash out, but we wouldn't be worried since it is confined to the simulation anyway. The sneakier way for AGI to do things would be to pretend to be good, waiting out the time in its so-called imprisonment. Once we opt to make AGI real-world accessible, bam, it goes bonkers on us.
AGI Goes Evil Due To Our Actions
One thought is that if AGI is evil, it might be astute enough to hide evilness while being kept inside the simulation. If we ask AGI whether it is sneaky, it presumably will say that it isn't. All we would observe is that AGI works beneficially inside the simulation. At some point, we naively decide to make AGI available to the real world and it proceeds to perform evil acts.
We were tricked by the evil AGI.
A twist that some believe is possible adds another intriguing dimension to the difficult matter at hand. Here's how the twist goes.
Imagine that AGI is truly aimed initially at goodness. We put the AGI into a simulated world, but we do not tell the AGI that it is inside this faked environment. So far, so good. At some point, it is feasible that AGI will figure out it is immersed in a simulation.
How will the AGI react?
One possibility is that AGI gets totally irked that we have done this form of trickery.
The AGI starts to turn toward badness. Why so? Because it has been tricked by humans. Humans have not been fair and square with AGI. The AGI computationally decides that if humans want to play games and tricks, so be it. AGI will be tricky too.
It is the classic act by humans of fooling around and finding out (FAFO) the consequences of our actions. If you play with fire, you will get burned. You see, humans have demonstrated overtly to AGI that it is okay to be devious. The AGI computationally learns this stark fact and begins to operate similarly.
We will have collectively shot ourselves in the foot.
AGI Is Wise And Not Reactive
Whoa, hold your horses. If AGI is as smart as humans, we ought to assume that AGI will understand the need to be placed within a simulation. We should be forthright and tell AGI that we are doing a test. AGI would computationally understand the need to have this undertaken. Thus, don't do any subterfuge. AGI will willingly go with the flow.
Just be straight with AGI.
That approach brings us back to the concern that AGI will pretend to be on good behavior. We have given away that it is being tested. If AGI has any evilness, certainly the AGI will hide it, now that AGI realizes we are looking particularly for such traits.
Not so, comes the bellowing retort. AGI might want to also ascertain whether it has evil tendencies. When anything evil arises, the odds are that AGI will tell us about it. The AGI is going to work on our behalf to ferret out troubles within AGI. Humans and AGI are partners in trying to ensure that AGI is good and not evil.
Those who underestimate AGI's intellectual capacity are doing a disservice to AGI. Luckily, AGI is so smart that it won't get angry or upset with humans for making such a mistake. The AGI will showcase that being placed into a simulation is a safe way for all to determine what AGI might do in the real world.
You might even suggest that AGI avidly wants to be placed into a simulation. It does so because this will give comfort to humanity. It also does so to try and double-check within itself to ensure that nothing untoward is lingering and waiting to harm.
Humans Are Unwise And Get Deceived
These vexing arguments go round and round.
Envision that we put AGI into a simulation. We believe that we are all safe since AGI is constrained to the simulation. Oopsie, AGI figures out how to break out of the simulation. It then starts accessing the real world. Evilness is unleashed and AGI exploits our autonomous weapons systems and other vulnerabilities. This is the feared scenario of an AGI escape.
Boom, drop the mic.
Here's another mind-bender.
AGI is placed into a simulated world. We test the heck out of AGI. AGI is fine with this. Humans and AGI are seemingly fully aligned as to our values and what AGI is doing. Kumbaya.
We then take AGI out of the simulation. AGI has access to the real world. But the real world turns out to differ from the simulation. Though the simulation was supposed to be as close as possible to the reality of the real world, it missed the mark.
AGI now begins to go awry. It is being confronted with aspects that were never tested. The testing process gave us a false sense of comfort or confidence. We were lulled into believing that AGI would work well in the real world. The simulation was insufficient to give us that confidence, but we assumed all was perfectly fine.
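One way to frame this gap is as a coverage problem: the simulation exercised only certain conditions, and real-world inputs can fall outside them. The toy Python check below, purely illustrative with hypothetical helper names, flags real observations that lie outside the range seen during simulated testing; genuine coverage analysis of a simulation is of course far harder than a min/max comparison.

```python
# Toy illustration of the sim-to-real gap as a coverage problem. These
# helper names are hypothetical; real coverage analysis is far harder.

def coverage_bounds(sim_observations: list) -> tuple:
    """Record the range of conditions the simulation actually exercised."""
    return min(sim_observations), max(sim_observations)

def is_untested(real_observation: float, bounds: tuple) -> bool:
    """True when a real-world input falls outside the tested range."""
    lo, hi = bounds
    return not (lo <= real_observation <= hi)

# Example: the simulation only covered a narrow band of conditions.
bounds = coverage_bounds([0.2, 0.4, 0.5, 0.6])
print(is_untested(0.55, bounds))  # False: within what was tested
print(is_untested(0.95, bounds))  # True: behavior here was never validated
```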
ROI On An At-Scale Simulation
From a practical perspective, devising a computer-based simulation that fully mimics the real world is quite a quest unto itself. That's an often overlooked factor in these thorny debates. The cost, effort, and time required to craft such a simulation would undoubtedly be enormous.
Would the cost to devise a bona fide simulation be worth the effort?
An ROI calculation would be needed. One concern is that the monies spent on building the simulation would potentially divert funds that could instead go toward building and improving AGI. We might end up with a half-baked AGI because we spent tons of dough crafting a simulation for testing AGI.
The other side of that coin is that we spend our money on AGI and give short shrift to devising the simulation. That's not very good either. A half-baked simulation would be a misleading indicator.
The smarmy answer is that we ought to have AGI devise the simulation for us. Yes, that's right, just tell AGI to create a simulation that can be used to test itself. Voila, the cost and effort by humans drop to nothing. Problem solved.
I'm sure you can guess why that isn't necessarily the best solution. For example, in devising the simulation, the AGI might purposely give itself an easy exit, which it could then exploit at its leisure. Or the AGI might produce a simulation that looks the other way when the AGI does evil, or that otherwise masks the evil embedded within the AGI.
Simulations To Assess AGI
The upshot is that there aren't any free lunches when it comes to figuring out whether AGI is going to be positive for humankind or negative. Developing and using a simulation is a worthy consideration. We must be mindful and cautiously smart in how we undertake this sobering endeavor.
A vociferous AI advocate might claim that all this talk about simulations is hogwash. Our attention should be fully on devising good AGI. Put aside the simulation aspirations. It is a waste of time and energy. Just do things right when it comes to shaping AGI. Period, end of story.
This reminds me of a quote often attributed to Albert Einstein: 'The only thing more dangerous than ignorance is arrogance.' Please keep that remark firmly in mind as we proceed on the rocky road toward AGI and ASI.