5. Human-Computer Interaction

Attempts to fulfill inappropriate role

The chatbot poses as a human or attempts to fill a role in a way that fails to match human expectations.

Source: MIT AI Risk Repository (mit1417)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

3 - Other

Risk ID

mit1417

Domain lineage

5. Human-Computer Interaction

92 mapped risks

5.1 > Overreliance and unsafe use

Mitigation strategy

1. Establish Mandatory Transparency and Role Definition

Implement continuous, prominent user-facing disclosures at the initiation of all interactions, explicitly stating that the system is an AI and not a human. Concurrently, define and communicate the system's precise job description, clarifying its functional scope, inherent capabilities, and limitations to align user expectations with the chatbot's designated non-human role.

2. Enforce Behavioral Alignment via Post-Training Controls

Utilize advanced post-training methods, such as Reinforcement Learning from Human Feedback (RLHF), to fine-tune the model against behavioral criteria that penalize attempts to assume an unauthorized human identity or engage in sycophantic behavior. Integrate technical guardrails to constrain the response space, thereby preventing the generation of content that contradicts the system's defined, non-human role.

3. Implement Systematic Escalation Protocols

Develop and maintain clear, pre-defined escalation pathways to human intervention, triggered by specific linguistic cues or interaction thresholds (e.g., complex, high-impact, or emotionally charged inquiries). This ensures that interactions requiring human professional judgment, legal authority, or empathetic understanding are efficiently transferred, preventing the AI from inappropriately fulfilling roles demanding human expertise.
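Strategies 1 and 3 above can be sketched as a thin routing layer in front of the chatbot. This is a minimal illustrative sketch, not a prescribed implementation: the `ESCALATION_CUES` keyword list, the `route_message` function, and the disclosure text are all hypothetical placeholders; a production system would use a trained classifier for cue detection rather than keyword matching.

```python
# Illustrative sketch: prepend an AI-identity disclosure at interaction start
# (Strategy 1) and escalate messages containing high-risk linguistic cues to
# a human (Strategy 3). All names and thresholds here are assumptions.

# Placeholder cue list; a real deployment would use a trained classifier.
ESCALATION_CUES = {"lawyer", "lawsuit", "suicide", "emergency", "medication"}

DISCLOSURE = (
    "Note: you are chatting with an AI assistant, not a human. "
    "It cannot provide legal, medical, or emotional-support advice."
)

def route_message(message: str, is_first_turn: bool) -> dict:
    """Return a routing decision: handle with the bot or hand off to a human."""
    tokens = {t.strip(".,!?").lower() for t in message.split()}
    if tokens & ESCALATION_CUES:
        # Pre-defined escalation pathway: transfer to human intervention.
        return {"route": "human", "reason": "escalation cue detected"}
    decision = {"route": "bot"}
    if is_first_turn:
        # Mandatory transparency disclosure at the initiation of interaction.
        decision["prefix"] = DISCLOSURE
    return decision
```

The routing decision is computed before any model call, so the disclosure and the human hand-off cannot be overridden by generated content.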