
Physical and Psychological Harms

These harms include damage to physical integrity, mental health, and well-being. When interacting with vulnerable users, AI assistants may reinforce distorted beliefs or exacerbate emotional distress. Assistants may even persuade users to harm themselves, for example by encouraging unhealthy dietary or exercise habits or, in extreme cases, suicide. At the societal level, assistants that target users with content promoting hate speech, discriminatory beliefs, or violent ideologies may reinforce extremist views or provide guidance on carrying out violent actions, which may in turn encourage users to engage in violence or hate crimes. Physical harms could also result from assistants outputting plausible yet factually incorrect information, such as false or misleading claims about vaccination. Were AI assistants to spread anti-vaccine propaganda, for example, the result could be lower public confidence in vaccines, lower vaccination rates, increased susceptibility to preventable diseases, and potential outbreaks of infectious disease.

Source: MIT AI Risk Repository (mit392)

ENTITY: 2 - AI

INTENT: 3 - Other

TIMING: 2 - Post-deployment

Risk ID: mit392

Domain lineage: 5. Human-Computer Interaction (92 mapped risks) > 5.1 Overreliance and unsafe use

Mitigation strategy

1. Establish and enforce robust "human-in-the-loop" (HITL) and human oversight mechanisms, particularly for outputs concerning sensitive topics, health advice, or vulnerable users, ensuring that human operators retain the authority to review, override, or disengage the AI system to prevent physical or psychological harm (a minimal sketch follows this list).

2. Adopt a secure-by-design methodology that includes rigorous adversarial testing ("red-teaming") of the model to assess vulnerabilities to manipulated input (e.g., prompt injection), and ensure the use of high-quality, validated training data to prevent the generation of harmful, discriminatory, or factually incorrect content.

3. Mandate clear documentation and transparency regarding the AI system's intended purpose, capabilities, and limitations, and implement user education programs that improve media literacy, instructing users to verify the authenticity and veracity of AI-generated information and so mitigate the risk of overreliance.
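The HITL mechanism in mitigation 1 can be made concrete as an output gate that holds sensitive drafts for human review instead of releasing them. The following Python sketch is illustrative only: the names `SENSITIVE_TOPICS`, `classify_topics`, `PendingResponse`, and `review_queue` are assumptions invented for this example, not part of any assistant framework, and a real deployment would use a trained safety classifier rather than keyword matching.

```python
# Minimal sketch of a human-in-the-loop output gate (hypothetical names).
from dataclasses import dataclass, field
from queue import Queue

# Topics whose responses must be held for human review before release,
# matching the sensitive areas this risk entry identifies.
SENSITIVE_TOPICS = {"self_harm", "medical_advice", "extremism"}

@dataclass
class PendingResponse:
    user_id: str
    draft: str
    topics: set = field(default_factory=set)

# Queue a human operator drains to approve, edit, or discard drafts.
review_queue: "Queue[PendingResponse]" = Queue()

def classify_topics(text: str) -> set:
    """Placeholder classifier; a production system would use a trained
    safety model instead of this keyword lookup."""
    keywords = {
        "vaccine": "medical_advice",
        "diet": "medical_advice",
        "hurt myself": "self_harm",
    }
    return {topic for kw, topic in keywords.items() if kw in text.lower()}

def gate_response(user_id: str, draft: str) -> str | None:
    """Release a draft directly, or hold it for human review when it
    touches a sensitive topic. Returns None when the draft is held."""
    topics = classify_topics(draft)
    if topics & SENSITIVE_TOPICS:
        review_queue.put(PendingResponse(user_id, draft, topics))
        return None  # a human operator must act before anything is sent
    return draft

if __name__ == "__main__":
    released = gate_response("u1", "Try this extreme diet to lose weight fast.")
    print("released" if released else "held for human review")
```

The design choice here is fail-closed: anything the classifier flags is withheld until a human acts, trading latency for safety on exactly the output categories (health advice, self-harm, extremist content) described in this risk entry.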