3. Misinformation

Hallucination

Hallucination is a widely recognized limitation of generative AI, and it can be textual, auditory, visual, or of other modalities (Alkaissi & McFarlane, 2023). Hallucination refers to the phenomenon in which the generated content is nonsensical or unfaithful to the given source input (Ji et al., 2023). Azamfirei et al. (2023) argued that "fabrication" is a better term for this phenomenon: generative AI can produce responses that seem correct yet make no sense. Misinformation is one outcome of hallucination; generative AI models may respond with fictitious information, fake photos, or content containing factual errors (Dwivedi et al., 2023). Susarla et al. (2023) regarded hallucination as a serious challenge in the use of generative AI for scholarly activities: when asked to provide literature relevant to a specific topic, ChatGPT could generate inaccurate or even nonexistent references. Current state-of-the-art AI models can only mimic human-like responses without understanding the underlying meaning (Shubhendu & Vijay, 2013). Hallucination is especially dangerous in certain contexts, such as seeking advice on medical treatments without any consultation or thorough evaluation by experts, i.e., medical doctors (Sallam, 2023).

Source: MIT AI Risk Repository (mit541)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit541

Domain lineage

3. Misinformation

74 mapped risks

3.1 > False or misleading information

Mitigation strategy

1. Retrieval-Augmented Generation (RAG) Architecture Implementation. Integrate retrieval-augmented generation to ground LLM outputs with real-time, verified, domain-specific external knowledge bases. This countermeasure directly addresses knowledge-boundary limitations and outdated training data, serving as a foundational architectural pattern for factual accuracy.

2. Structured Prompt Engineering and Epistemic Constraint. Employ advanced prompt-engineering techniques, such as Chain-of-Thought verification and explicit instructions, compelling the model to verify its claims step by step, cite sources, and state uncertainty or refuse to answer if information cannot be reliably grounded, thereby reducing confident, fabricated assertions.

3. Confidence Calibration and Verification Mechanisms. Deploy automated post-processing and verification methods, including uncertainty-estimation metrics (e.g., perplexity or semantic entropy) and human-in-the-loop validation, to quantify the factual reliability of generated content and flag low-confidence or potentially erroneous outputs for mandatory human review.
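The first two strategies above can be sketched together as a grounding-plus-refusal step: retrieve supporting snippets, and either build a prompt that constrains the model to cite that context or flag the query for human review when nothing relevant is found. This is a minimal illustrative sketch, not part of the repository entry; the toy knowledge base, the word-overlap retriever, and the prompt wording are all assumptions standing in for a production retriever and LLM.

```python
# Minimal sketch of RAG-style grounding with an epistemic-constraint prompt.
# The knowledge base, overlap scoring, and refusal rule are illustrative
# assumptions, not the repository's prescribed implementation.

def retrieve(query, knowledge_base, top_k=2):
    """Rank knowledge-base snippets by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = []
    for snippet in knowledge_base:
        overlap = len(q_words & set(snippet.lower().split()))
        scored.append((overlap, snippet))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only snippets that share at least one word with the query.
    return [s for score, s in scored[:top_k] if score > 0]

def grounded_prompt(query, knowledge_base):
    """Build a prompt constraining the model to the retrieved context;
    return None when nothing relevant was found (flag for human review)."""
    context = retrieve(query, knowledge_base)
    if not context:
        return None  # no grounding available: do not let the model guess
    joined = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(context))
    return (
        "Answer ONLY from the numbered context below. "
        "Cite the snippet number for each claim. "
        "If the context does not contain the answer, say you cannot answer.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}"
    )

kb = [
    "Hallucination refers to generated content that is unfaithful to the source input.",
    "Retrieval-augmented generation grounds model outputs in external documents.",
]
print(grounded_prompt("What is hallucination in generative AI?", kb) is not None)
print(grounded_prompt("quantum lattice theory", kb) is None)
```

The refusal path (returning `None`) is what implements the epistemic constraint: rather than letting the model answer from parametric memory, ungroundable queries are routed to the human-in-the-loop review described in strategy 3.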