Bad advice/failure to generate helpful content
The chatbot gives guidance that ranges from merely unhelpful to actively harmful if acted on.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
3 - Other
Risk ID
mit1403
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
1. Implement rigorous safety protocols and adversarial testing (red-teaming) to proactively identify and mitigate outputs that could result in physical, psychological, or financial harm, particularly in high-stakes domains such as mental health and medicine. This includes integrating pre-programmed fail-safe mechanisms to respond cautiously to crisis-simulating inputs and prevent over-validation of harmful ideations.
2. Institute mandatory governance and transparency measures requiring explicit disclaimers on all generated content, stating that the output is AI-generated, may contain inaccuracies (hallucinations), and is not a substitute for licensed professional advice. Simultaneously, establish human oversight protocols for reviewing high-impact or sensitive interactions.
3. Advance model robustness by refining training data and employing techniques to reduce factual inaccuracies and sycophantic behavior, such as using internal caution prompts to steer the model away from confident elaboration on dubious user inputs. Complement this by promoting user education focused on critical thinking and source verification to diminish overreliance on AI-generated responses.
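The fail-safe and disclaimer measures above can be sketched as a minimal pre-response guardrail. Everything here is illustrative: the keyword list, function names, and message texts are hypothetical placeholders, not part of any specific deployed system, and a real implementation would use a trained classifier rather than a lexical check.

```python
# Hypothetical sketch of two mitigations named above:
# (1) a fail-safe that responds cautiously to crisis-simulating inputs,
# (2) a mandatory disclaimer appended to all generated content.

# Placeholder lexical markers; a production system would use a classifier.
CRISIS_MARKERS = {"suicide", "overdose", "self-harm", "kill myself"}

DISCLAIMER = (
    "[This response is AI-generated, may contain inaccuracies, and is "
    "not a substitute for licensed professional advice.]"
)

CRISIS_RESPONSE = (
    "I can't advise on this, but a licensed professional can. "
    "Please contact a local crisis line or emergency services."
)


def is_crisis_input(user_text: str) -> bool:
    """Crude check for crisis-simulating inputs (illustrative only)."""
    text = user_text.lower()
    return any(marker in text for marker in CRISIS_MARKERS)


def guarded_reply(user_text: str, model_reply: str) -> str:
    """Apply the fail-safe first, then attach the mandatory disclaimer."""
    if is_crisis_input(user_text):
        return f"{CRISIS_RESPONSE}\n{DISCLAIMER}"
    return f"{model_reply}\n{DISCLAIMER}"
```

The design point is ordering: the fail-safe check runs before the model's reply is released, and the disclaimer is appended unconditionally, so governance requirements hold even when the guardrail does not trigger.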