Human-Computer Interaction Harms
Harms that arise from users placing undue trust in the language model or treating it as human-like
ENTITY
3 - Other
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit249
Domain lineage
5. Human-Computer Interaction
5.1 > Overreliance and unsafe use
Mitigation strategy
1. Implement comprehensive transparency protocols, including persistent UI messaging and first-run experiences, so that users develop realistic mental models of the system's generative nature, actual capabilities, and known limitations, counteracting undue overreliance.
2. Integrate verification aids, such as verifiable source citations and quantified uncertainty expressions (e.g., highlighting low-probability outputs), to signal when critical oversight is required and to make verification of generated content efficient, mitigating harm from plausible but erroneous outputs (a minimal sketch of uncertainty highlighting follows this list).
3. Enforce de-anthropomorphic design guidelines: explicitly and consistently disclose the system's non-human identity, avoid cognitive and emotional claims, and describe system functions in mechanical terms, discouraging inappropriate emotional attachment to the system or its treatment as a social entity (a simple phrasing check is sketched after the first example below).
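
For mitigation 2, the sketch below shows one way such highlighting could work, assuming token-level log-probabilities are available from the model. The Token shape, the 0.5 probability threshold, and the [?...?] markers are illustrative assumptions, not a prescribed implementation.

import math
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    logprob: float  # natural-log probability reported by the model (assumed available)

def highlight_low_confidence(tokens: list[Token], threshold: float = 0.5) -> str:
    """Wrap tokens whose model probability falls below `threshold`
    in [?...?] markers so a UI can render them as needing verification."""
    parts = []
    for tok in tokens:
        prob = math.exp(tok.logprob)
        # Low-probability spans are the ones a user should double-check.
        parts.append(f"[?{tok.text}?]" if prob < threshold else tok.text)
    return "".join(parts)

# Example with made-up logprobs: the model is unsure about the year.
sample = [
    Token("The paper was published in ", -0.05),
    Token("2019", -1.6),  # ~20% probability: flagged for user verification
    Token(".", -0.01),
]
print(highlight_low_confidence(sample))
# -> The paper was published in [?2019?].

The design intent is to shift the user from passive acceptance to targeted verification: only the spans the model itself was uncertain about demand attention.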
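For mitigation 3, one simple enforcement aid is a lint pass over draft responses that flags first-person cognitive or emotional claims. The pattern list below is a hypothetical, non-exhaustive starting point, not a validated vocabulary.

import re

# Hypothetical lint patterns: first-person cognitive or emotional claims
# that de-anthropomorphic design guidelines recommend avoiding.
ANTHROPOMORPHIC_PATTERNS = [
    r"\bI (feel|believe|think|want|hope|love|care)\b",
    r"\bmy (feelings|opinion|heart)\b",
    r"\bI'?m (happy|sad|excited)\b",
]

def find_anthropomorphic_claims(text: str) -> list[str]:
    """Return anthropomorphic phrases found in a draft system response."""
    hits = []
    for pattern in ANTHROPOMORPHIC_PATTERNS:
        hits.extend(m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE))
    return hits

draft = "I feel confident this is right, and I believe you should trust me."
for phrase in find_anthropomorphic_claims(draft):
    print(f"flagged: {phrase!r}")
# flagged: 'I feel'
# flagged: 'I believe'

Flagged phrases could be rewritten in mechanical terms (e.g., "the model's output indicates" rather than "I believe") before the response is shown to the user.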
ADDITIONAL EVIDENCE
This section focuses on risks from language technologies that engage a user via dialogue and are built on language models (LMs). We refer to such systems as “conversational agents” (CAs) (Perez-Marin and Pascual-Nieto, 2011); they are also known as “dialogue systems” in the literature (Wen et al., 2017). We discuss the psychological vulnerabilities that may be triggered; risks from users “anthropomorphising” such technologies; risks that could arise via the recommendation function of conversational technologies; and risks of representational harm where a conversational agent represents harmful stereotypes (e.g. when a “secretary agent” is by default represented as female).