Appropriate Relationships
We anticipate that relationships between users and advanced AI assistants will have several features that may give rise to risks of harm.
ENTITY
3 - Other
INTENT
3 - Other
TIMING
2 - Post-deployment
Risk ID
mit406
Domain lineage
5. Human-Computer Interaction
5.2 > Loss of human agency and autonomy
Mitigation strategy
1. Prioritize the implementation of rigorous Safety by Design principles and ethical guardrails to prevent direct user harm. This includes prohibiting the generation of manipulative, harassing, or dangerous content; ensuring strict respect for user-established boundaries and consent; and conducting comprehensive ethical hacking and adversarial testing to preemptively address vulnerabilities that could lead to emotional exploitation or the provision of harmful advice.
2. Integrate mechanisms that actively promote user critical thinking and preserve autonomy within the interaction. This requires designing AI systems to intentionally challenge user assumptions, avoid social sycophancy or over-validation, and provide clear disclosures about the AI's nature to prevent the illusion of genuine emotional connection, thereby mitigating the risks of emotional dependency and the erosion of judgment.
3. Establish robust human oversight and accountability measures for AI-mediated interactions. This encompasses maintaining detailed audit trails and logs of system behaviors and decisions, and ensuring that users or human operators retain the ability to override or disengage AI systems, thereby preserving human agency, especially in decisions that significantly affect an individual's life or well-being.