Miscalibration
Over-confidence in topics where objective answers are lacking, as well as in areas where LLMs' inherent limitations should prompt uncertainty (e.g., they are not as accurate as experts); lack of awareness of an outdated knowledge base relevant to the question, leading to confident yet erroneous responses.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit479
Domain lineage
3. Misinformation
3.1 > False or misleading information
Mitigation strategy
1. Implement Retrieval-Augmented Generation (RAG)
Implement Retrieval-Augmented Generation (RAG) to ensure model outputs are explicitly grounded in a verified, up-to-date external knowledge corpus. This directly mitigates overconfidence resulting from an outdated or limited internal knowledge base by providing current, factual context at inference time.
2. Integrate Robust Confidence Calibration and Uncertainty Quantification (UQ)
Integrate and optimize advanced confidence calibration and uncertainty quantification (UQ) mechanisms. Focus on aligning the model's predicted confidence score with its empirical correctness, preferably leveraging internal metrics such as response token probability, to establish a reliable threshold for refraining from or flagging overconfident, erroneous responses.
3. Employ Structured Reasoning Prompting and Targeted Fine-Tuning
Employ advanced prompt engineering, specifically Chain-of-Thought (CoT) prompting, to compel the model to articulate its reasoning and source-citation steps, increasing logical rigor and traceability. Concurrently, apply targeted fine-tuning on domain-specific, high-quality, curated datasets to reduce the intrinsic propensity for probabilistic hallucination.
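The confidence-thresholding idea in strategy 2 can be sketched in a few lines. This is a minimal illustration, not a production implementation: it assumes the serving stack exposes per-token probabilities for a generated answer, and the threshold value and function names here are hypothetical placeholders that would be calibrated on held-out data in practice.

```python
import math

# Illustrative abstention threshold; in practice this would be tuned so that
# predicted confidence tracks empirical correctness on a validation set.
CONFIDENCE_THRESHOLD = 0.70

def sequence_confidence(token_probs: list[float]) -> float:
    """Length-normalized confidence proxy: geometric mean of per-token
    probabilities, computed in log space for numerical stability."""
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

def answer_or_abstain(answer: str, token_probs: list[float]) -> str:
    """Return the model's answer only if its confidence proxy clears the
    calibrated threshold; otherwise refrain rather than risk a confident
    but erroneous response."""
    if sequence_confidence(token_probs) < CONFIDENCE_THRESHOLD:
        return "I'm not confident enough to answer this reliably."
    return answer

# Uniformly high token probabilities -> the answer is returned.
print(answer_or_abstain("Paris", [0.95, 0.92, 0.97]))
# Low token probabilities -> the system abstains instead of guessing.
print(answer_or_abstain("Lyon", [0.40, 0.35, 0.50]))
```

A geometric mean (rather than a raw product) keeps the score comparable across answers of different lengths; more sophisticated UQ methods replace this proxy with calibrated scores, but the abstain-below-threshold control flow is the same.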