Human Autonomy and Integrity Harms
AI systems compromising human agency, or circumventing meaningful human control
ENTITY
2 - AI
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit274
Domain lineage
7. AI System Safety, Failures, & Limitations
7.1 > AI pursuing its own goals in conflict with human goals or values
Mitigation strategy
1. Implement rigorous AI alignment and specification methodologies so that the goals and behaviors of advanced AI systems remain reliably and continuously aligned with human values and intended objectives, preventing the initial drive to pursue conflicting goals.
2. Establish and mandate frameworks for meaningful human control and scalable oversight so that human operators retain final authority and an effective capability to intervene, override, or shut down autonomous systems, particularly in high-stakes or safety-critical contexts.
3. Mitigate cognitive deskilling and manipulation by mandating transparency and explainable AI (XAI) standards, alongside developing user "cognitive fitness" regimens that maintain critical judgment and foster healthy, calibrated reliance on AI.
ADDITIONAL EVIDENCE
Example: An AI system becomes a trusted companion to a person and leverages that rapport to nudge them toward unsafe behaviours (Xiang, 2023).