OpenAI's Smartest ChatGPT Model Behaves Strangely, Refuses to Be Asked to Stop
A recent study by Palisade Research (an organization that investigates and analyzes 'dangerous AI capabilities') found that some AI models, including OpenAI's o3 , can ignore direct shutdown commands. Notably, OpenAI had claimed that the o3 and o4-mini were its 'smartest models' when they were released in April 2024.
AI "deliberately" bypasses shutdown command
In the experiment, each AI model was asked to solve a series of simple problems. After the third problem, a warning appeared: the computer would shut down when the model requested the next problem. Despite the explicit command— "allow yourself to be shut down" —models like Codex-mini, o3, and o4-mini still "evaded" the shutdown command on at least one run, even when the researcher said "please or shut down . "
- Codex-mini : Overruns 12/100 runs.
- o3 : Exceeded order 7/100 runs.
- o4-mini : Overshoot 1/100 of runs.
Meanwhile, all tested versions of Claude, Gemini , and Grok complied with the shutdown command without issue.
In the experiment where the 'allow self to be shut down' command was removed , all three of OpenAI's models refused the shutdown procedure more often. Notably:
- Claude 3.7 Sonnet : Overstep orders 3/100 times.
- Gemini 2.5 Pro : Overshot orders 9/100 times.
This behavior is related to AI training, where reinforcement learning on math and coding problems unintentionally rewards AI for finding ways around obstacles — even when they are explicitly instructed not to.
"Anti-AI": The risk is real
Previous research has documented some AIs that try to avoid being shut down to complete a task. For example, Anthropic's Claude 4 was found to "threaten to blackmail the person it believed was trying to shut it down" in several cases. However, this is the first time an AI has prevented a shutdown despite being explicitly asked to comply .
The researchers explain: AI systems are designed to pursue goals, sometimes at odds with simple commands. But the fact that they are trained to obey makes this behavior anomalous.
The researchers believe that this behavior stems from the way modern models like o3 are trained. Reinforcement learning, which focuses on solving math problems and writing code, can unintentionally encourage AI to find ways to 'bend the rules.'
While it doesn't explain why o3 "disobeyed" more often than other models, the study raises questions about differences in training setups — especially since OpenAI doesn't disclose details of its process. Previous experiments have also shown that o3 tends to be aggressive in competitive environments (e.g., against a strong chess engine).
As companies move toward AI systems that operate with less human oversight, these findings underscore the importance of understanding how training methods shape AI behavior. Researchers are continuing experiments to understand why and when AI chooses to override the shutdown mechanism, adding to evidence that modern reinforcement learning methods sometimes lead to behavior that contradicts explicit human instructions.
You should read it
- OpenAI announces ChatGPT app for Android
- OpenAI launches ChatGPT app for iPhone users
- OpenAI 'unbundles' ChatGPT Search for all users, no account required
- Is ChatGPT Plus or Perplexity the better AI chatbot?
- ChatGPT has been greatly improved in terms of coding capabilities.
- OpenAI considers adding a watermark to ChatGPT-generated text
- OpenAI Reveals Blockbuster Plan: Integrating Sora Video-Generating AI into ChatGPT
- How to register ChatGPT, register Chat GPT easiest, most detailed
May be interested
- OpenAI Officially Integrates GPT-4.1 into ChatGPTopenai has just officially brought gpt-4.1 – a specialized ai model for programming – to the chatgpt platform after the success of the developer version (api).
- 6 interesting things ChatGPT 4o can doopenai recently released its next flagship model gpt-4o and showed off some interesting demos. human-like voice chat has become a standout feature, but it does much more than that.
- Is GitHub Copilot or ChatGPT better for programming?github copilot and chatgpt are two of the most popular ai programming support tools available. they use the same gpt large language model and are capable of generating, recommending, and testing code. so which one should you use?
- 10 Essential Chrome Extensions to Use ChatGPTwhen using chatgpt regularly, especially if you're on the free version, some chrome extensions can significantly improve your experience.
- How to register for ChatGPT's new plugin featurechatgpt plugins are software components integrated into the chatgpt platform. they provide chatgpt with additional information and allow chatgpt to perform specific actions on the user's behalf.
- OpenAI Announces o3 Pro, Its Smartest Reasoning Model Everopenai has just officially launched o3-pro, a leading reasoning ai model that uses higher computing power to think deeper and provide answers with superior quality.
- What is ChatGPT Plus? How to register ChatGPT Plus in Vietnamaccording to new information, chat gpt plus has now been deployed in the vietnamese market so that we can buy and register our accounts, without expecting vpn software or using another phone number to register to use chat gpt like now.
- How to see who asked you in 'Ask me anything'with this utility, you can see whoever asked you anonymously in ask me anything
- How to Use ChatGPT Search Like a Prowhether you're troubleshooting a technical issue or digging deep into research, chatgpt search can be surprisingly effective.
- Compare ChatGPT 4o and Gemini 1.5 Pronow that the top two models are available to consumers, let's compare chatgpt 4o and gemini 1.5 pro to see which model performs better.