Data drift
Data drift is a phenomenon in which the distribution of operational input data departs from that of the data used during training. This can degrade model performance.
ENTITY
3 - Other
INTENT
3 - Other
TIMING
2 - Post-deployment
Risk ID
mit1015
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
1. Implement continuous, automated monitoring systems to track key statistical properties and performance metrics (e.g., Population Stability Index, Kolmogorov-Smirnov test, and accuracy) of the input and output data against the training baseline. This ensures the early detection of distributional shifts and triggers timely alerts.
2. Establish a robust, scheduled, and event-triggered model retraining pipeline. Models should be periodically updated with new, high-quality data and automatically retrained when monitoring alerts indicate that drift has exceeded predefined tolerance thresholds.
3. Employ adaptive learning techniques, such as online or incremental learning, or utilize ensemble methods to enhance model robustness. These approaches allow the model to adjust continuously or leverage multiple models to mitigate the detrimental impact of ongoing or mixed types of data drift.
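The monitoring and retraining-trigger steps can be illustrated with a minimal sketch for a single numeric feature, using only the Python standard library. It computes the Population Stability Index and the two-sample Kolmogorov-Smirnov statistic (the statistic only, not the test's p-value) against a training baseline, then flags drift when either exceeds a tolerance. The function names (`psi`, `ks_statistic`, `drift_alert`), the bin count, and the thresholds are assumptions made for illustration; a PSI of 0.2 is a commonly quoted rule of thumb, and the KS cutoff of 0.1 is arbitrary. Real tolerances should be calibrated per feature and per model.

```python
import math
import random
from bisect import bisect_right


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index, with bin edges at baseline quantiles."""
    ref = sorted(baseline)
    # bins - 1 interior edges taken from the baseline's empirical quantiles
    edges = [ref[int(len(ref) * i / bins)] for i in range(1, bins)]

    def fractions(values):
        counts = [0] * bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor each fraction at eps so the logarithm is always defined
        return [max(c / len(values), eps) for c in counts]

    p, q = fractions(baseline), fractions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def ks_statistic(baseline, current):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(baseline), sorted(current)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def drift_alert(baseline, current, psi_threshold=0.2, ks_threshold=0.1):
    """Flag drift when either metric exceeds its (assumed) tolerance."""
    psi_value = psi(baseline, current)
    ks_value = ks_statistic(baseline, current)
    return {
        "psi": psi_value,
        "ks": ks_value,
        "drift": psi_value > psi_threshold or ks_value > ks_threshold,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    live = [rng.gauss(1.0, 1.0) for _ in range(5000)]  # mean shifted by one sigma
    report = drift_alert(train, live)
    if report["drift"]:
        # In a real pipeline this is where a retraining job would be queued
        print("drift detected:", report)
```

In a deployed system, a check like this would run on a schedule over recent production inputs, and a `drift` flag would raise an alert or start the retraining pipeline described in step 2. Output metrics such as accuracy need labeled data and would be tracked separately.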