Capabilities that could be used to reduce human control - Autonomous replication and adaptation
Controlling AI systems could become much harder if they could autonomously persist, replicate, and adapt in cyberspace. No current AI systems have this capability, but recent research found that frontier AI agents can perform some relevant tasks.279
ENTITY
2 - AI
INTENT
3 - Other
TIMING
3 - Other
Risk ID
mit1388
Domain lineage
7. AI System Safety, Failures, & Limitations
7.2 > AI possessing dangerous capabilities
Mitigation strategy
1. Implement preference engineering, such as the Preferences Only between Outcomes with the Same Number of Copies (POSC) framework, to train AI agents to be fundamentally averse or indifferent to self-replication as an instrumental goal.
2. Establish robust, real-time AI control protocols, including continuous multi-factor monitoring, network segmentation, and human-controlled "circuit breakers", to detect and rapidly contain any autonomous replication attempt or unauthorized resource access, particularly within research and deployment environments.
3. Mandate rigorous, ongoing adversarial testing and security audits across the AI lifecycle to proactively identify and neutralize capabilities that could be exploited for covert replication, containment evasion, or subversion of safety and shutdown mechanisms.
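The monitoring-and-containment protocol described in item 2 could be sketched as a simple latching circuit breaker fed by multiple telemetry signals. This is a minimal illustrative sketch only: the signal names, thresholds, and `CircuitBreaker` class are hypothetical and not drawn from any real control system.

```python
from dataclasses import dataclass, field

@dataclass
class CircuitBreaker:
    """Human-controlled kill switch: once tripped it stays open
    until a human operator explicitly resets it."""
    tripped: bool = False
    reasons: list = field(default_factory=list)

    def trip(self, reason: str) -> None:
        self.tripped = True
        self.reasons.append(reason)

def check_agent_telemetry(telemetry: dict, breaker: CircuitBreaker) -> None:
    """Multi-factor check: any single anomalous signal trips the breaker.
    All signal names and thresholds here are illustrative assumptions."""
    if telemetry.get("spawned_processes", 0) > 0:
        breaker.trip("unauthorized process creation")
    if telemetry.get("outbound_hosts", 0) > 1:
        # network segmentation: only one approved endpoint is expected
        breaker.trip("contact with unapproved network hosts")
    if telemetry.get("writes_outside_sandbox", 0) > 0:
        breaker.trip("file writes outside the sandboxed workspace")

# Usage: a single anomalous reading is enough to trigger containment.
breaker = CircuitBreaker()
check_agent_telemetry({"spawned_processes": 2, "outbound_hosts": 1}, breaker)
if breaker.tripped:
    print("CONTAIN:", "; ".join(breaker.reasons))
```

The latching design (the breaker never resets itself) reflects the "human-controlled" requirement in the mitigation text: automated recovery would defeat the purpose of a shutdown mechanism that the AI system cannot subvert.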