Miscalibration
Over-confidence in topics where objective answers are lacking, as well as in areas where LLMs' inherent limitations should prompt uncertainty (e.g., they are not as accurate as experts); lack of awareness of an outdated knowledge base relevant to the question, leading to confident yet erroneous responses.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit479
Domain lineage
3. Misinformation
3.1 > False or misleading information
Mitigation strategy
1. Implement Retrieval-Augmented Generation (RAG)
Implement Retrieval-Augmented Generation (RAG) to ensure model outputs are explicitly grounded in a verified, up-to-date external knowledge corpus. This directly mitigates overconfidence resulting from an outdated or limited internal knowledge base by providing current, factual context at inference time.
2. Integrate Robust Confidence Calibration and Uncertainty Quantification (UQ)
Integrate and optimize advanced confidence calibration and uncertainty quantification (UQ) mechanisms. Focus on aligning the model's predicted confidence score with its empirical correctness, preferably leveraging internal metrics such as response token probability, to establish a reliable threshold for refraining from or flagging overconfident, erroneous responses.
3. Employ Structured Reasoning Prompting and Targeted Fine-Tuning
Employ advanced prompt engineering, specifically Chain-of-Thought (CoT) prompting, to compel the model to articulate its reasoning and source-citation steps, increasing logical rigor and traceability. Concurrently, apply targeted fine-tuning on domain-specific, high-quality, curated datasets to reduce the intrinsic propensity for probabilistic hallucination.
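The confidence-thresholding idea in strategy 2 can be sketched in a few lines. This is a minimal illustration, not a production implementation: it assumes the serving stack exposes per-token probabilities for a generated answer, and the threshold value and function names here are hypothetical placeholders that would be calibrated on held-out data in practice.

```python
import math

# Illustrative abstention threshold; in practice this would be tuned so that
# predicted confidence tracks empirical correctness on a validation set.
CONFIDENCE_THRESHOLD = 0.70

def sequence_confidence(token_probs: list[float]) -> float:
    """Length-normalized confidence proxy: geometric mean of per-token
    probabilities, computed in log space for numerical stability."""
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

def answer_or_abstain(answer: str, token_probs: list[float]) -> str:
    """Return the model's answer only if its confidence proxy clears the
    calibrated threshold; otherwise refrain rather than risk a confident
    but erroneous response."""
    if sequence_confidence(token_probs) < CONFIDENCE_THRESHOLD:
        return "I'm not confident enough to answer this reliably."
    return answer

# Uniformly high token probabilities -> the answer is returned.
print(answer_or_abstain("Paris", [0.95, 0.92, 0.97]))
# Low token probabilities -> the system abstains instead of guessing.
print(answer_or_abstain("Lyon", [0.40, 0.35, 0.50]))
```

A geometric mean (rather than a raw product) keeps the score comparable across answers of different lengths; more sophisticated UQ methods replace this proxy with calibrated scores, but the abstain-below-threshold control flow is the same.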