Harm caused by incompetent systems
While HP#1 concerns mean or best-case performance, HP#2 concerns worst-case performance: how can we ensure that AI systems perform safely, and how can we prove this? ML systems have been deployed in high-stakes, safety-critical domains such as driving [182], medicine [113], and warfare [298]. Many more systems have been developed but remain undeployed, or have been rolled back, for regulatory and safety reasons [471]. Clearly, unsafe systems can result in loss of life, economic damage, and social unrest [407, 10]. Most concerningly, AI systems may be susceptible to so-called "normal accidents" [63], creating cascading errors that are difficult to prevent merely by maintaining a nominal "human in the loop" [122]. Most advanced ML models perform far below the reliability level customary in engineering fields [359], and because we do not fully understand how cutting-edge systems achieve their results, we cannot yet detect and prevent dangerous modes of operation [285].
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit879
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
- Implement rigorous, continuous robustness testing, including adversarial training and stress testing under extreme and novel operating conditions, to proactively enhance system resilience against latent failure modes and distribution shifts
- Design systems with reduced complexity and coupling, incorporating resilient error-handling mechanisms such as circuit breakers and containment protocols to localize and prevent the propagation of initial component failures (cascading errors)
- Deploy comprehensive, real-time observability tools (monitoring and anomaly detection) and explainable AI (XAI) to detect dangerous modes of operation, and formally empower human operators to intervene (escalation/rollback) when necessary
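The circuit-breaker mechanism named in the second mitigation can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not a reference implementation: the threshold and timeout values are arbitrary, and the wrapped prediction function stands in for any hypothetical ML component whose failures should be contained rather than propagated downstream.

```python
import time


class CircuitBreaker:
    """Minimal circuit-breaker sketch: after repeated failures the
    breaker trips open, and callers receive a safe fallback instead
    of letting errors cascade through coupled components."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold  # failures before tripping
        self.reset_timeout = reset_timeout          # seconds before retrying
        self.failures = 0
        self.opened_at = None  # None => closed (normal operation)

    def call(self, fn, *args, fallback=None):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                return fallback  # open: contain the failure locally
            # half-open: timeout elapsed, allow one trial call
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip open
            return fallback
        self.failures = 0  # success resets the failure count
        return result
```

In use, a hypothetical `model_predict` would be invoked as `breaker.call(model_predict, inputs, fallback=SAFE_DEFAULT)`, so that a misbehaving model yields a conservative default (and, in a real deployment, an escalation to a human operator) rather than a cascading error.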