Inconsistency
Models may fail to give consistent answers when the same question is posed by different users, by the same user in different sessions, or even across turns within a single conversation.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit478
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
1. Establish continuous performance monitoring with dedicated observability platforms to track and log output variability, stability metrics, and consistency degradation across user sessions and identical inputs, ensuring timely detection of model drift.
2. Implement a human-in-the-loop validation process for high-stakes outputs, integrating expert review and validation against established knowledge bases to ensure logical coherence and factual accuracy before the model's response is finalized.
3. Apply prompt engineering strategies, such as few-shot examples or explicit constraint setting, and aggregate (ensemble) outputs across multiple model inferences to stabilize responses and reduce sensitivity to minor semantic variations in the input query.
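The output-aggregation idea in strategy 3 can be sketched as a majority vote over repeated inferences, which also yields a consistency score usable for the monitoring described in strategy 1. This is a minimal illustration, not a production implementation; `mock_generate` is a hypothetical stand-in for a real model call.

```python
from collections import Counter
import itertools


def aggregate_responses(generate, prompt, n_samples=5):
    """Query the model n_samples times on the same prompt and return
    the majority answer plus the fraction of samples that agree with it."""
    answers = [generate(prompt) for _ in range(n_samples)]
    majority, count = Counter(answers).most_common(1)[0]
    consistency = count / n_samples  # 1.0 means fully consistent output
    return majority, consistency


# Hypothetical stub simulating an inconsistent model (4 of 5 calls agree).
_outputs = itertools.cycle(["Paris", "Paris", "Lyon", "Paris", "Paris"])

def mock_generate(prompt):
    return next(_outputs)


answer, consistency = aggregate_responses(mock_generate, "Capital of France?")
```

In practice the consistency score could be logged per prompt and alerted on when it drops below a threshold, giving a concrete signal for the drift detection in strategy 1.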