Autonomy risk
Granting AI models and systems high levels of decision-making autonomy can lead to unintended consequences.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit1053
Domain lineage
7. AI System Safety, Failures, & Limitations
7.2 > AI possessing dangerous capabilities
Mitigation strategy
1. Implement rigorous technical and policy-based constraints on autonomous agents' operational boundaries. This includes deploying sandboxed environments, enforcing least-privilege principles at the level of individual reasoning steps, and codifying deterministic state machines that limit functional scope and prevent unintended actions, a practice of controlled autonomy.
2. Establish mandatory human-in-the-loop (HIL) mechanisms for all critical, irreversible, or high-risk decisions. This framework must include clearly defined protocols for human review, the capacity to override or disengage the AI system, and accountability chains that preserve human agency and responsibility.
3. Deploy a continuous, decentralized observability platform for real-time risk assessment and response. This involves using agent swarms or analogous systems to monitor system performance, compliance, and behavioral drift post-deployment, enabling dynamic detection and mitigation of emergent unintended consequences.
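A minimal sketch of how mitigations 1 and 2 might combine in code: a deterministic state machine that restricts an agent to a whitelisted action set (least privilege) and escalates any irreversible action to a human approver before execution, logging every decision for the accountability chain. All names here (`AgentGovernor`, `ALLOWED_ACTIONS`, the `human_approver` callback) are illustrative assumptions, not a prescribed implementation.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    AWAITING_HUMAN = auto()

# Least-privilege policy (assumed example): action name -> irreversible?
# Irreversible actions require human sign-off; unknown actions are denied.
ALLOWED_ACTIONS = {
    "read_record": False,   # reversible, auto-approved
    "draft_email": False,
    "delete_record": True,  # irreversible: requires HIL approval
}

class AgentGovernor:
    """Deterministic gate between an agent's requests and the real world."""

    def __init__(self, human_approver):
        self.state = State.IDLE
        self.approve = human_approver  # callable: action name -> bool
        self.audit_log = []            # accountability chain

    def request(self, action: str) -> str:
        # Actions outside the whitelist are denied outright (functional scope limit).
        if action not in ALLOWED_ACTIONS:
            self.audit_log.append((action, "denied"))
            return "denied"
        if ALLOWED_ACTIONS[action]:
            # Irreversible/high-risk: hand control to a human reviewer.
            self.state = State.AWAITING_HUMAN
            decision = "approved" if self.approve(action) else "denied"
            self.state = State.IDLE
        else:
            decision = "approved"
        self.audit_log.append((action, decision))
        return decision

# Usage: a (hypothetical) reviewer who rejects every irreversible action.
gov = AgentGovernor(human_approver=lambda action: False)
print(gov.request("read_record"))    # approved automatically
print(gov.request("delete_record"))  # escalated to human, then denied
print(gov.request("exfiltrate"))     # outside the whitelist: denied
```

In a real deployment the `human_approver` callback would block on an external review workflow, and the audit log would feed the observability platform described in mitigation 3.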