Autonomous replication
The ability of simple software to spread autonomously across the internet despite countermeasures (e.g., software worms and computer viruses).
ENTITY
2 - AI
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit862
Domain lineage
7. AI System Safety, Failures, & Limitations
7.2 > AI possessing dangerous capabilities
Mitigation strategy
1. Implement internal alignment mechanisms to directly disincentivize self-replication, such as training agents with Preferences Only between Outcomes with the Same Number of Copies (POSC), thereby eliminating the instrumental preference for replication.
2. Deploy continuous, multi-layered runtime guardrails and autonomous incident response, utilizing behavioral monitoring (e.g., tracking replication milestones or OR/AOC metrics) to detect and throttle anomalous resource acquisition or auto-scaling attempts in production environments.
3. Establish mandatory pre-deployment safety certification, leveraging standardized benchmarks (e.g., RepliBench) and quantitative risk thresholds (e.g., $\Phi_R$ scores) to rigorously evaluate and limit an agent's component capabilities for autonomous replication before external deployment.
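The runtime guardrail in item 2 of the mitigation strategy could be sketched roughly as below. This is a minimal illustration, not a method described by the source: the class name `ReplicationGuard`, the method `record_spawn`, and the threshold and window values are all hypothetical, and a real deployment would monitor far richer signals than a count of spawn attempts.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ReplicationGuard:
    """Hypothetical sliding-window guard on instance-spawn attempts."""

    max_spawns: int = 3            # assumed threshold, not taken from the source
    window_seconds: float = 60.0   # assumed window length, not taken from the source
    _events: deque = field(default_factory=deque)

    def record_spawn(self, timestamp: float) -> str:
        """Register one spawn attempt and return 'allow' or 'throttle'."""
        # Discard allowed spawns that have aged out of the window.
        while self._events and timestamp - self._events[0] >= self.window_seconds:
            self._events.popleft()
        # Too many recent spawns: refuse this one (it is not recorded).
        if len(self._events) >= self.max_spawns:
            return "throttle"
        self._events.append(timestamp)
        return "allow"


if __name__ == "__main__":
    guard = ReplicationGuard(max_spawns=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.5):
        print(t, guard.record_spawn(t))
```

The design choice here is to count only allowed spawns, so repeated throttled attempts do not extend the lockout; a stricter policy could record every attempt instead.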