Paradigm & Distribution Shifts
Knowledge bases that LLMs are trained on continue to shift: questions such as "who scored the most points in NBA history?" or "who is the richest person in the world?" may have answers that need to be updated over time, or even in real time.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit507
Domain lineage
3. Misinformation
3.1 > False or misleading information
Mitigation strategy
1. Implement Retrieval-Augmented Generation (RAG) and limited search engine integration to provide the Large Language Model (LLM) with access to real-time or near real-time external knowledge, directly mitigating risks associated with temporal drift in factual knowledge.
2. Establish Continuous Monitoring and Adaptive Alignment frameworks, including feedback loops and Key Risk Indicators (KRIs), to dynamically track performance degradation and policy non-compliance stemming from concept shifts, enabling prompt detection and remediation of emerging vulnerabilities.
3. Conduct Targeted Fine-Tuning and Adversarial Training using synthetically generated or collected out-of-distribution (OOD) and stylistically diverse data to enhance the model's generalization capacity and resilience against unforeseen shifts in user input or underlying data distributions.
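The first mitigation (RAG) can be sketched in miniature: retrieve fresh documents relevant to a query and prepend them to the prompt so the model answers from current facts rather than stale training-time knowledge. Everything below is a hypothetical toy, assuming a keyword-overlap retriever and a plain-text knowledge store; a real deployment would use a vector index and an actual LLM call, neither of which is shown.

```python
import string

def _words(text: str) -> set[str]:
    # Lowercase and strip punctuation so "history?" matches "history".
    return {w.strip(string.punctuation) for w in text.lower().split()}

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word-overlap with the query; return the top k.
    A toy stand-in for a real retriever (e.g., a vector similarity search)."""
    q = _words(query)
    scored = sorted(docs, key=lambda d: len(q & _words(d)), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved context so the model grounds its answer in it,
    mitigating temporal drift in the model's internal knowledge."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (f"Context (retrieved from an up-to-date source):\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context above.")

# Hypothetical knowledge store, assumed to be refreshed continuously
# from an external source rather than frozen at training time.
knowledge_store = [
    "LeBron James passed Kareem Abdul-Jabbar in career points to hold "
    "the NBA all-time scoring record as of 2023.",
    "The Eiffel Tower is located in Paris, France.",
]

prompt = build_prompt("Who scored the most points in NBA history?", knowledge_store)
```

The key design point is that the answer's freshness is now bounded by the knowledge store's refresh cadence, not by the model's training cutoff.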
ADDITIONAL EVIDENCE
Local policies (e.g., content moderation policies) change and adapt over time. For example, certain content or subjects (e.g., LGBTQ-related identities) might pass a local content moderation policy and be considered acceptable at one point, but a term it contains may later come to be regarded as offensive, so the same content would no longer pass.