7. AI System Safety, Failures, & Limitations | 2 - Post-deployment

Operational data issues

Before an AI application is deployed into its operational environment, it has only been tested on a test set that aims to approximate the distribution of operational data. If the operational data deviate from this approximation in unexpected ways, the application can behave unreliably. Its behavior on actual operational data therefore needs to be evaluated.

Source: MIT AI Risk Repository (mit1014)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit1014

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.3 > Lack of capability or robustness

Mitigation strategy

1. Establish a Continuous Socio-Technical Observability Framework
Implement real-time monitoring of the deployed AI system to track technical metrics (e.g., model performance, data integrity, and throughput) alongside value indicators. Crucially, the system must employ statistical methods to detect and alert stakeholders to **distributional shift** (data and model drift) between the training environment and the live operational data, ensuring a scientifically sound basis for reliability assessment.

2. Execute Phased Operational Deployment with Concurrently Validated Testing
Before committing to full operational release, the AI application should be subjected to a **Shadow Deployment** or a gradual, **phased rollout**. This process involves running the new model in parallel with the current system, or on a limited, low-risk segment of live data, to rigorously evaluate its behavior against the actual operational data distribution without impacting critical business processes.

3. Institutionalize an Automated, Data-Driven Model Retraining Pipeline
Develop an MLOps (Machine Learning Operations) framework that supports the systematic and automated **retraining** of the model using newly validated operational data. This pipeline must be triggered by predefined metrics (e.g., detection of significant performance degradation or pronounced data drift) to ensure the AI model continuously adapts to the evolving characteristics of the production environment, thereby maintaining predictive robustness.
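
As a rough illustration of the statistical drift check in strategy 1 and the metric-based retraining trigger in strategy 3, the sketch below computes a Population Stability Index (PSI) for a single numeric feature and feeds it into a simple trigger. None of this comes from the source entry: PSI is one of several possible drift statistics, and the function names `psi` and `should_retrain`, the 0.2 PSI limit (a common rule of thumb), the 0.9 accuracy floor, and the synthetic Gaussian data are all assumptions chosen for illustration.

```python
import math
import random


def psi(reference, live, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm below stays defined.
        return [max(c / len(sample), eps) for c in counts]

    ref_p, live_p = proportions(reference), proportions(live)
    return sum((q - p) * math.log(q / p) for p, q in zip(ref_p, live_p))


def should_retrain(psi_value, live_accuracy, psi_limit=0.2, min_accuracy=0.9):
    """Hypothetical trigger: fire on pronounced drift or degraded accuracy.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    return psi_value > psi_limit or live_accuracy < min_accuracy


# Synthetic stand-ins for pre-deployment test data and live operational data.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # test-time feature values
same_dist = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # operational data, no shift
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]    # operational data, mean shift

psi_stable = psi(reference, same_dist)
psi_shifted = psi(reference, shifted)

print("retrain (no shift):  ", should_retrain(psi_stable, live_accuracy=0.95))
print("retrain (mean shift):", should_retrain(psi_shifted, live_accuracy=0.95))
```

In a real pipeline the reference sample would come from the data the model was validated on, the live sample from a recent window of production inputs, and the check would typically run per feature on a schedule, with alerts routed to the responsible stakeholders rather than printed.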