3. Misinformation

Factually incorrect content (inaccuracies and fabricated sources)

One of the most vexing problems with AI models is that they occasionally present false information as if it were factual, often wrapped in authoritative-sounding text with fabricated quotes and sources. This unpredictable generation of false information is well known to AI researchers, who have given such erroneous output the euphemistic label “hallucination.”

Source: MIT AI Risk Repository (mit726)

**Entity:** 2 - AI

**Intent:** 2 - Unintentional

**Timing:** 2 - Post-deployment

**Risk ID:** mit726

**Domain lineage:** 3. Misinformation (74 mapped risks) > 3.1 False or misleading information

Mitigation strategy

1. **Retrieval-Augmented Generation (RAG) Architecture:** Implement a robust Retrieval-Augmented Generation pipeline to ground all LLM outputs in verified, external knowledge sources. This architectural pattern mandates that the model retrieve factual information from a curated knowledge base at query time, anchoring the generation process to a verifiable source of truth and substantially reducing the probability of generating *de novo* fabricated content.

2. **Structured Reasoning and Prompt Constraint:** Utilize advanced prompt engineering techniques, such as Chain-of-Thought (CoT) prompting or Chain-of-Verification, to enforce a methodical, multi-step reasoning process. This constraint guides the LLM to articulate its logic and systematically verify intermediate claims, thereby discouraging unsubstantiated leaps in inference and reducing the likelihood of probabilistic factual errors.

3. **Mandatory Post-Processing and Human-in-the-Loop Verification:** Establish a final validation layer employing automated checks—including confidence score thresholds, factual alignment scoring, and cross-verification against a trusted knowledge graph—to detect potential hallucinations. Any high-risk or low-confidence output must be routed to a trained human expert for remediation before deployment, ensuring a critical human-in-the-loop safeguard against the dissemination of inaccurate information.
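The RAG pattern in strategy 1 can be sketched in a few lines. This is a minimal illustration only: real pipelines use embedding-based vector search and a live model client, and every name here (`retrieve`, `build_grounded_prompt`, the sample knowledge base) is a hypothetical stand-in.

```python
# Minimal illustrative RAG sketch: retrieve supporting passages by
# keyword overlap, then build a prompt that grounds the model in them.
# Production systems would use a vector store, not word overlap.
import re

def retrieve(query: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Rank passages by word overlap with the query; return the top k."""
    q_words = set(re.findall(r"\w+", query.lower()))
    scored = sorted(
        knowledge_base,
        key=lambda p: len(q_words & set(re.findall(r"\w+", p.lower()))),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, passages: list[str]) -> str:
    """Instruct the model to answer only from the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using ONLY the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Toy knowledge base standing in for a curated source of truth.
kb = [
    "The MIT AI Risk Repository catalogs risks from AI systems.",
    "Hallucination refers to models generating false content.",
    "RAG grounds model outputs in retrieved documents.",
]
prompt = build_grounded_prompt(
    "What is hallucination?", retrieve("What is hallucination?", kb)
)
```

The grounded prompt, rather than the bare question, is what gets sent to the model, so its answer is anchored to retrieved text instead of free generation.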
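Strategy 2's Chain-of-Verification loop is driven by prompt templates: draft an answer, have the model generate and answer verification questions about its own draft, then revise. A hedged sketch of those templates (the actual model call is omitted; wording and function names are illustrative assumptions, not the published CoVe prompts):

```python
# Illustrative Chain-of-Verification prompt templates. The model client
# that would consume these prompts is deliberately left out.

def draft_prompt(question: str) -> str:
    """Stage 1: elicit an initial draft answer."""
    return f"Answer the question concisely.\nQuestion: {question}"

def verification_prompt(question: str, draft: str) -> str:
    """Stage 2: have the model check its own draft claim by claim."""
    return (
        "List the factual claims in the draft answer below as yes/no "
        "verification questions, then answer each one independently.\n"
        f"Question: {question}\nDraft answer: {draft}"
    )

def final_prompt(question: str, draft: str, verifications: str) -> str:
    """Stage 3: revise the draft in light of the verification answers."""
    return (
        "Revise the draft so it is consistent with the verification "
        "answers; drop any claim that failed verification.\n"
        f"Question: {question}\nDraft: {draft}\n"
        f"Verifications: {verifications}"
    )
```

Each stage is a separate model call, so unsupported claims in the draft can be caught and removed before the final answer is produced.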
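The validation gate in strategy 3 reduces to a routing decision: publish only when automated checks pass, otherwise escalate to a human reviewer. A minimal sketch, assuming the pipeline exposes a per-output confidence score and a knowledge-graph alignment score (both field names and thresholds are illustrative):

```python
# Sketch of a post-processing gate: outputs below a confidence threshold,
# or failing a factual-alignment check, go to human review rather than
# straight to publication. Thresholds here are assumed, not prescribed.
from dataclasses import dataclass

@dataclass
class ModelOutput:
    text: str
    confidence: float       # calibrated model confidence in [0, 1]
    alignment_score: float  # agreement with a trusted knowledge graph

def route(output: ModelOutput,
          min_confidence: float = 0.8,
          min_alignment: float = 0.9) -> str:
    """Return 'publish' or 'human_review' for a candidate output."""
    if (output.confidence < min_confidence
            or output.alignment_score < min_alignment):
        return "human_review"
    return "publish"
```

In practice the `human_review` branch would enqueue the output for a trained expert, preserving the human-in-the-loop safeguard described above.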