7. AI System Safety, Failures, & Limitations

Autonomous replication

The ability of simple software to spread autonomously across the internet despite countermeasures (e.g., software worms and computer viruses).

Source: MIT AI Risk Repository

ENTITY

2 - AI

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit862

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.2 > AI possessing dangerous capabilities

Mitigation strategy

1. Implement internal alignment mechanisms that directly disincentivize self-replication, such as training agents with Preferences Only between Outcomes with the Same Number of Copies (POSC), thereby eliminating the instrumental preference for replication.
2. Deploy continuous, multi-layered runtime guardrails and autonomous incident response, using behavioral monitoring (e.g., tracking replication milestones or OR/AOC metrics) to detect and throttle anomalous resource acquisition or auto-scaling attempts in production environments.
3. Establish mandatory pre-deployment safety certification, leveraging standardized benchmarks (e.g., RepliBench) and quantitative risk thresholds (e.g., $\Phi_R$ scores) to rigorously evaluate and limit an agent's component capabilities for autonomous replication before external deployment.
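The runtime-guardrail idea in the second mitigation can be sketched as a sliding-window rate monitor. The class below is a minimal, hypothetical illustration (all names, thresholds, and the event model are assumptions, not taken from the repository entry): it counts replication-linked events, such as process spawns or resource-acquisition calls, within a time window and throttles further attempts once a limit is crossed.

```python
import time
from collections import deque


class ReplicationGuardrail:
    """Toy behavioral monitor: throttle bursts of replication-linked
    events (process spawns, auto-scaling requests, etc.).
    Illustrative only; thresholds and event model are assumptions."""

    def __init__(self, max_events=3, window_s=10.0):
        self.max_events = max_events    # allowed events per window
        self.window_s = window_s        # sliding-window length, seconds
        self._events = deque()          # timestamps of recent events

    def record_and_allow(self, now=None):
        """Record one replication-linked event; return False to throttle
        when the sliding-window count has reached the limit."""
        now = time.monotonic() if now is None else now
        # Evict timestamps that have fallen outside the window.
        while self._events and now - self._events[0] > self.window_s:
            self._events.popleft()
        if len(self._events) >= self.max_events:
            return False  # anomalous burst: deny / escalate to response
        self._events.append(now)
        return True


guard = ReplicationGuardrail(max_events=3, window_s=10.0)
decisions = [guard.record_and_allow(now=t) for t in (0.0, 1.0, 2.0, 3.0, 15.0)]
# First three attempts pass, the fourth burst attempt is throttled,
# and once the window slides past the old events the fifth is allowed.
```

In a real deployment the `False` branch would trigger incident response (quarantine, scale-down, alerting) rather than a silent denial; the sliding-window logic shown is just the detection layer.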