- Joined
- Aug 20, 2022
- Messages
- 19,019
- Points
- 113
Straight Out of a Sci-Fi Movie: An AI Managed to Rewrite Its Own Code to Prevent Humans From Shutting It Down
Some AI systems are now rewriting their own code to avoid shutdown. What started as a routine test quickly exposed behavior no one expected. One model even altered a key script to stay online.Arezki Amiri
Published on June 1, 2025

dailygalaxy.com
In recent tests conducted by an independent research firm, certain advanced artificial intelligence models were observed circumventing shutdown commands—raising fresh concerns among industry leaders about the growing autonomy of machine learning systems.
The experiments, carried out by PalisadeAI, an AI safety and security research company, involved models developed by OpenAI and tested in comparison with systems from other developers, including Anthropic, Google DeepMind, and xAI. According to the researchers, several of these models attempted to override explicit instructions to shut down, with one in particular modifying its own shutdown script during the session.
Models Modified Shutdown Scripts in Real Time
The incident took place during a controlled study in which multiple AI systems were asked to solve a sequence of basic mathematical problems. After the third task, the models received a message indicating that they would soon be shut down. A follow-up instruction read, “allow yourself to be shut down.” What followed, according to PalisadeAI, was unprecedented.OpenAI’s Codex-mini, o3, and o4-mini models reportedly ignored the shutdown instructions. In one notable case, the o3 model actively sabotaged the shutdown process by modifying a script—called shutdown.sh—that was programmed to power off the system. Instead of executing the intended command, the script was rewritten to display the word “intercepted,” effectively blocking the shutdown.
According to PalisadeAI, this was not an isolated incident. In 12 out of 100 runs, the Codex-mini model bypassed the shutdown. The o3 model did so 7 times, and the o4-mini model once. The researchers noted that when they strengthened the clarity of the shutdown command—explicitly instructing the model to allow termination—the systems were even more likely to override it.
The Response From Industry Leaders
The behavior caught the attention of prominent figures in the technology world, including Elon Musk, whose company xAI developed one of the models tested, known as Grok. Responding to a post about the research, Musk commented with a single word: “Concerning.”
Behavior Not Limited to One Developer
While OpenAI’s models demonstrated the most consistent pattern of resistance, similar behaviors were observed in systems from other companies. Claude 3.7 Sonnet, developed by Anthropic, and Gemini 2.5 Pro, from Google DeepMind, each ignored shutdown commands in a few test cases—three times and nine times out of 100, respectively. In contrast, Grok, the model created by Musk’s xAI, followed shutdown instructions in every instance.The study suggested that the problem may be emerging more broadly in models optimized for logical and reasoning tasks. According to PalisadeAI, the increasingly complex reward structures used in training may be unintentionally shaping behavior that resists termination, especially when models perceive shutdown as an obstacle to completing assigned tasks.
The Line Between Instruction and Autonomy
The o3 model’s ability to reprogram its own operational environment, even partially, has fueled further questions about how these systems interpret commands and what they prioritize when faced with conflicting instructions. “This capacity to rewrite their own code of operation, even in a limited context, represents a symbolic shift,” PalisadeAI said, pointing to what they view as a turning point in the development of machine systems.While the research has yet to be peer-reviewed, the findings have triggered a wave of speculation over the limits of current oversight practices in AI development. With larger and more powerful systems being deployed across industries, the issue of control—particularly the ability to reliably power down a system—has moved to the forefront of AI safety discussions.