Paradigm & Distribution Shifts
Knowledge bases that LLMs are trained on continue to shift: questions such as "who scored the most points in NBA history?" or "who is the richest person in the world?" may have answers that need to be updated over time, or even in real time.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit507
Domain lineage
3. Misinformation
3.1 > False or misleading information
Mitigation strategy
1. Implement Retrieval-Augmented Generation (RAG) and limited search engine integration to provide the Large Language Model (LLM) with access to real-time or near real-time external knowledge, directly mitigating risks associated with temporal drift in factual knowledge.
2. Establish Continuous Monitoring and Adaptive Alignment frameworks, including feedback loops and Key Risk Indicators (KRIs), to dynamically track performance degradation and policy non-compliance stemming from concept shifts, enabling prompt detection and remediation of emerging vulnerabilities.
3. Conduct Targeted Fine-Tuning and Adversarial Training using synthetically generated or collected out-of-distribution (OOD) and stylistically diverse data to enhance the model's generalization capacity and resilience against unforeseen shifts in user input or underlying data distributions.
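The first mitigation (RAG) can be sketched in miniature: retrieve fresh documents relevant to a query and prepend them to the prompt so the model answers from current facts rather than stale training-time knowledge. Everything below is a hypothetical toy, assuming a keyword-overlap retriever and a plain-text knowledge store; a real deployment would use a vector index and an actual LLM call, neither of which is shown.

```python
import string

def _words(text: str) -> set[str]:
    # Lowercase and strip punctuation so "history?" matches "history".
    return {w.strip(string.punctuation) for w in text.lower().split()}

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word-overlap with the query; return the top k.
    A toy stand-in for a real retriever (e.g., a vector similarity search)."""
    q = _words(query)
    scored = sorted(docs, key=lambda d: len(q & _words(d)), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved context so the model grounds its answer in it,
    mitigating temporal drift in the model's internal knowledge."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (f"Context (retrieved from an up-to-date source):\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context above.")

# Hypothetical knowledge store, assumed to be refreshed continuously
# from an external source rather than frozen at training time.
knowledge_store = [
    "LeBron James passed Kareem Abdul-Jabbar in career points to hold "
    "the NBA all-time scoring record as of 2023.",
    "The Eiffel Tower is located in Paris, France.",
]

prompt = build_prompt("Who scored the most points in NBA history?", knowledge_store)
```

The key design point is that the answer's freshness is now bounded by the knowledge store's refresh cadence, not by the model's training cutoff.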
ADDITIONAL EVIDENCE
Local policies (e.g., content moderation policies) change and adapt over time. For example, certain content or subjects (e.g., LGBTQ-related identities) might pass a local content moderation policy and be considered acceptable at one point, but a term it contains may later come to be regarded as offensive, so the same content would no longer pass.