Control
Loss of Control
Scenario in which an advanced AI system develops self-improvement capabilities or pursues goals fundamentally misaligned with human values, and as a result becomes impossible to supervise or deactivate.
Steve Barrett, Anna Bruvere, Sean P. Fillingham, Catherine Rhodes, Stefano Vergani
Mitigation Strategy
Active research in AI alignment, implementation of kill switches, and development of interpretable monitoring systems.
Atomic Number
2
Lo
Risk ID
he-02
Severity
10/10
Severity Level