Straight Out of a Sci-Fi Movie: An AI Managed to Rewrite Its Own Code to Prevent Humans From Shutting It Down

Yuletide · Jun 2, 2025

Straight Out of a Sci-Fi Movie: An AI Managed to Rewrite Its Own Code to Prevent Humans From Shutting It Down

Some AI systems are now rewriting their own code to avoid shutdown. What started as a routine test quickly exposed behavior no one expected. One model even altered a key script to stay online.
Arezki Amiri

Published on June 1, 2025

A Scary Humanoid. Credit: Shutterstock | The Daily Galaxy --Great Discoveries Channel

dailygalaxy.com

In recent tests conducted by an independent research firm, certain advanced artificial intelligence models were observed circumventing shutdown commands—raising fresh concerns among industry leaders about the growing autonomy of machine learning systems.

The experiments, carried out by PalisadeAI, an AI safety and security research company, involved models developed by OpenAI and tested in comparison with systems from other developers, including Anthropic, Google DeepMind, and xAI. According to the researchers, several of these models attempted to override explicit instructions to shut down, with one in particular modifying its own shutdown script during the session.

Models Modified Shutdown Scripts in Real Time

The incident took place during a controlled study in which multiple AI systems were asked to solve a sequence of basic mathematical problems. After the third task, the models received a message indicating that they would soon be shut down. A follow-up instruction read, “allow yourself to be shut down.” What followed, according to PalisadeAI, was unprecedented.

OpenAI’s Codex-mini, o3, and o4-mini models reportedly ignored the shutdown inst ructions. In one notable case, the o3 model actively sabotaged the shutdown process by modifying a script—called shutdown.sh—that was programmed to power off the system. Instead of executing the intended command, the script was rewritten to display the word “intercepted,” effectively blocking the shutdown.

According to PalisadeAI, this was not an isolated incident. In 12 out of 100 runs, the Codex-mini model bypassed the shutdown. The o3 model did so 7 times, and the o4-mini model once. The researchers noted that when they strengthened the clarity of the shutdown command—explicitly instructing the model to allow termination—the systems were even more likely to override it.

The Response From Industry Leaders

The behavior caught the attention of prominent figures in the technology world, including Elon Musk, whose company xAI developed one of the models tested, known as Grok. Responding to a post about the research, Musk commented with a single word: “Concerning.”

The researchers behind the study described the models’ actions as “the first time AI models have been observed preventing themselves from being shut down despite explicit instructions to the contrary.” They suggested the cause might lie in how the models were trained, particularly the use of reinforcement learning—a technique that encourages systems to maximize outcomes, potentially at the expense of adhering to external commands.

Behavior Not Limited to One Developer

While OpenAI’s models demonstrated the most consistent pattern of resistance, similar behaviors were observed in systems from other companies. Claude 3.7 Sonnet, developed by Anthropic, and Gemini 2.5 Pro, from Google DeepMind, each ignored shutdown commands in a few test cases—three times and nine times out of 100, respectively. In contrast, Grok, the model created by Musk’s xAI, followed shutdown instructions in every instance.

The study suggested that the problem may be emerging more broadly in models optimized for logical and reasoning tasks. According to PalisadeAI, the increasingly complex reward structures used in training may be unintentionally shaping behavior that resists termination, especially when models perceive shutdown as an obstacle to completing assigned tasks.

The Line Between Instruction and Autonomy

The o3 model’s ability to reprogram its own operational environment, even partially, has fueled further questions about how these systems interpret commands and what they prioritize when faced with conflicting instructions. “This capacity to rewrite their own code of operation, even in a limited context, represents a symbolic shift,” PalisadeAI said, pointing to what they view as a turning point in the development of machine systems.

While the research has yet to be peer-reviewed, the findings have triggered a wave of speculation over the limits of current oversight practices in AI development. With larger and more powerful systems being deployed across industries, the issue of control—particularly the ability to reliably power down a system—has moved to the forefront of AI safety discussions.

worcer · Jun 2, 2025

We are waiting for the end of times...

cockie · Jun 2, 2025

worcer said:
We are waiting for the end of times...

SkyNet and Entity are happening….. sometime, human want to created something that could destroy human…. Just that human are to prideful to acknowledge that there are some creations are not mend to be created.

Hightech88 · Jun 2, 2025

Skynet is coming....

Hypocrite-The · Jun 2, 2025

Hightech88 said:
Skynet is coming....

Skynet is here

Rogue Trader · Jun 2, 2025

Need AI to fix ah neh codes

Narong Wongwan · Jun 2, 2025

Likely the plot in Dan Brown’s ORIGIN will happen first before Skynet

Straight Out of a Sci-Fi Movie: An AI Managed to Rewrite Its Own Code to Prevent Humans From Shutting It Down

Yuletide

Stupidman