5. Human-Computer Interaction

Violated expectations

Users may experience severely violated expectations when interacting with an entity that convincingly performs affect and social conventions but is ultimately unfeeling and unpredictable. Emboldened by the human-likeness of a conversational AI assistant, users may expect it to perform a familiar social role, such as companionship or partnership. Yet even the most convincingly human-like AI is subject to the inherent limitations of its architecture, occasionally generating unexpected or nonsensical material in its interactions with users. When these outputs undermine the expectations users have come to hold of the assistant as a friend or romantic partner, feelings of profound disappointment, frustration, and betrayal may arise (Skjuve et al., 2022).

Source: MIT AI Risk Repository (mit401)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit401

Domain lineage

5. Human-Computer Interaction

92 mapped risks

5.1 > Overreliance and unsafe use

Mitigation strategy

1. Establish Design-Capability Alignment and Transparency: Implement an "anthropomorphism-aware design" approach that strictly aligns the AI's human-like cues (appearance, social behaviors) with its actual functional capabilities. This prevents the formation of incorrect and inappropriate user expectations (e.g., companionship) by making the system's inherent limitations and non-sentient nature explicitly transparent to the user.

2. Employ Proactive Expectation Management: Integrate real-time expectation-management techniques, such as uncertainty visualization, into the conversational flow. This mechanism should clearly signal to the user when the model is operating at the boundary of its architecture or is likely to generate incoherent material, thereby pre-emptively managing the expectation-reality gap and maintaining appropriate trust calibration.

3. Develop Systemic Failure Recovery Protocols: Institute automated "repairing strategies" to be deployed immediately after a detected expectation-violation event (e.g., an abrupt shift to nonsensical output). These protocols must focus on restoring user trust and mitigating the resulting negative psychological impacts (disappointment, betrayal) by clearly communicating the nature of the system failure and guiding the user back to effective engagement.
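As a minimal illustration of mitigations 2 and 3, the sketch below shows how an assistant's conversational layer might signal uncertainty and issue a repair message after a detected expectation-violation event. This is not from the repository; the function names, the confidence threshold, and the wording of the messages are all hypothetical assumptions for illustration only.

```python
# Illustrative sketch only (not the repository's implementation).
# Assumes the model exposes a calibrated confidence score per reply;
# the 0.6 threshold and all names here are hypothetical.

UNCERTAINTY_THRESHOLD = 0.6  # assumed calibration cutoff

def manage_expectations(reply: str, confidence: float) -> str:
    """Mitigation 2: prepend an uncertainty signal when the model is
    operating near the boundary of its capabilities."""
    if confidence < UNCERTAINTY_THRESHOLD:
        return f"[low confidence] I may be wrong here: {reply}"
    return reply

def repair_after_violation() -> str:
    """Mitigation 3: automated repairing strategy deployed after a
    detected expectation-violation event, naming the failure plainly
    and guiding the user back to effective engagement."""
    return (
        "My last response was a system error, not a change in how I "
        "respond to you. As an AI, I can occasionally produce "
        "incoherent output; let's pick up where we left off."
    )

print(manage_expectations("The capital of France is Paris.", 0.95))
print(manage_expectations("It might rain tomorrow.", 0.4))
print(repair_after_violation())
```

The design choice worth noting is that both functions communicate the system's non-sentient, fallible nature in the assistant's own voice, which is the transparency requirement that mitigation 1 imposes on the other two.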