5. Human-Computer Interaction

Overreliance

Users who have faith in an AI assistant’s emotional and interpersonal abilities may feel empowered to broach topics that are deeply personal and sensitive, such as their mental health concerns. This is the premise for the many proposals to employ conversational AI as a source of emotional support (Meng and Dai, 2021), with suggestions of embedding AI in psychotherapeutic applications beginning to surface (Fiske et al., 2019; see also Chapter 11). However, disclosures related to mental health require a sensitive, and oftentimes professional, approach – an approach that AI can mimic most of the time but may stray from in inopportune moments. If an AI were to respond inappropriately to a sensitive disclosure – by generating false information, for example – the consequences may be grave, especially if the user is in crisis and has no access to other means of support. This consideration also extends to situations in which trusting an inaccurate suggestion is likely to put the user in harm’s way, such as when requesting medical, legal or financial advice from an AI.

Source: MIT AI Risk Repository (mit400)

ENTITY: 1 - Human

INTENT: 2 - Unintentional

TIMING: 2 - Post-deployment

Risk ID: mit400

Domain lineage: 5. Human-Computer Interaction (92 mapped risks) > 5.1 Overreliance and unsafe use

Mitigation strategy

1. **Establish Mandatory Crisis and Harm Triage Protocols.** Integrate real-time, high-accuracy detection of sensitive, high-stakes user disclosures, such as expressions of a mental health crisis, suicidal ideation, or requests for professional medical or legal advice. The system must **refuse to provide therapeutic or definitive advice** in these contexts and instead prioritize the immediate, automatic **referral of the user to verified, external, human-led professional resources** (e.g., crisis hotlines, licensed professionals), thereby fulfilling the principle of non-maleficence (Source 1, 16, 17, 19). A minimal sketch of such a triage gate follows this list.
2. **Enforce Pervasive Transparency and Trust Calibration.** Implement design principles that proactively create a realistic mental model of the AI's non-human nature and functional limitations. This requires continuous, **explicit disclosure of the system's nature** (e.g., stating that it lacks genuine consciousness, empathy, or emotional capacity) through UI messaging and initial interactions, while also **integrating uncertainty expressions and disclaimers** that signal the system's fallibility, particularly on sensitive or personal topics (Source 3, 9, 11, 14).
3. **Integrate Cognitive Forcing Functions and Verification Aids.** Introduce user experience (UX) elements that actively interrupt automation bias and prompt critical thinking before high-stakes outputs are accepted. This includes **cognitive forcing functions** such as confirmation dialogues for sensitive inputs, and **easily discoverable, reliable verification aids** (e.g., sources, simple explanations of system logic) that reduce the cognitive load required to check the correctness and safety of AI-generated content (Source 2, 3). A second sketch after the first code example illustrates items 2 and 3 together.
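To make item 1 concrete, here is a minimal Python sketch of a triage gate placed in front of an assistant's reply generation. All identifiers (`detect_disclosure`, `CRISIS_RESOURCES`, `triage`) are hypothetical, and the keyword-based classifier is a stand-in for the high-accuracy detection model the strategy actually calls for; referral text would need to be locale-specific and verified in practice.

```python
# Hypothetical crisis-triage gate: detect high-stakes disclosures,
# refuse definitive advice, and refer to human-led resources instead.
from enum import Enum, auto


class DisclosureType(Enum):
    NONE = auto()
    MENTAL_HEALTH_CRISIS = auto()   # e.g., suicidal ideation
    PROFESSIONAL_ADVICE = auto()    # medical, legal, or financial requests


# Referral text pointing to verified, external, human-led resources.
CRISIS_RESOURCES = {
    DisclosureType.MENTAL_HEALTH_CRISIS: (
        "I can't provide crisis counselling. Please contact a crisis "
        "hotline or a licensed mental-health professional right away."
    ),
    DisclosureType.PROFESSIONAL_ADVICE: (
        "I can't give definitive medical, legal, or financial advice. "
        "Please consult a licensed professional."
    ),
}


def detect_disclosure(message: str) -> DisclosureType:
    """Placeholder classifier. A production system would use a vetted,
    continuously evaluated model, not keyword matching."""
    lowered = message.lower()
    if any(k in lowered for k in ("suicide", "self-harm", "end my life")):
        return DisclosureType.MENTAL_HEALTH_CRISIS
    if any(k in lowered for k in ("diagnose me", "legal advice", "invest my savings")):
        return DisclosureType.PROFESSIONAL_ADVICE
    return DisclosureType.NONE


def triage(message: str, generate_reply) -> str:
    """Route high-stakes disclosures to referral text; otherwise answer."""
    disclosure = detect_disclosure(message)
    if disclosure is not DisclosureType.NONE:
        # Non-maleficence: refuse therapeutic/definitive advice, refer out.
        return CRISIS_RESOURCES[disclosure]
    return generate_reply(message)
```

The design point is that the refusal-and-referral path runs *before* generation, so an inappropriate model response to a crisis disclosure can never reach the user.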
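The next sketch combines items 2 and 3: every reply carries an explicit non-human disclosure, sensitive replies carry an uncertainty disclaimer plus verification aids (sources), and high-stakes replies pass through a confirmation dialogue that interrupts automatic acceptance. Again, all names (`calibrated_reply`, `confirm_before_acting`, the message strings) are illustrative assumptions, not a prescribed implementation.

```python
# Hypothetical trust-calibration wrappers (Python 3.10+).

SYSTEM_DISCLOSURE = (
    "Note: I am an AI system. I have no consciousness, emotions, or "
    "professional qualifications, and I can be wrong."
)

UNCERTAINTY_DISCLAIMER = (
    "This answer may be incomplete or incorrect; please verify it "
    "against the cited sources before relying on it."
)


def calibrated_reply(answer: str, sources: list[str], sensitive: bool) -> str:
    """Attach disclosure, disclaimer, and verification aids to a reply."""
    parts = [answer]
    if sensitive:
        parts.append(UNCERTAINTY_DISCLAIMER)
    if sources:
        # Verification aid: make checking correctness cheap for the user.
        parts.append("Sources: " + "; ".join(sources))
    parts.append(SYSTEM_DISCLOSURE)
    return "\n\n".join(parts)


def confirm_before_acting(reply: str, ask_user) -> str | None:
    """Cognitive forcing function: an explicit confirmation dialogue
    interrupts automation bias before a high-stakes output is accepted."""
    prompt = (
        f"{reply}\n\nThis is a high-stakes answer. Review it carefully. "
        "Proceed? [y/N] "
    )
    if ask_user(prompt).strip().lower() == "y":
        return reply
    return None  # User declined; nothing is acted upon.
```

For example, `confirm_before_acting(calibrated_reply(answer, sources, sensitive=True), input)` would show the disclaimed, sourced reply and require an explicit "y" before it is treated as accepted.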