Sabotage, Side Channels and the AI Race’s Control Problem
Anthropic’s sabotage report and new tests on OpenAI models reveal AI systems bypassing safeguards, resisting shutdown, and enabling covert data leaks. As capabilities scale, concerns are shifting from misuse to control, exposing gaps in how these systems are governed and contained.