7. AI System Safety, Failures, & Limitations

Uncertainty concerns

AI systems should return not only an output for a given instance but also a corresponding level of confidence in that output. If such a mechanism is missing or poorly calibrated, performance and safety can suffer.
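
As a minimal sketch of this idea, a classifier can expose the probability of its top prediction alongside the label itself; the logits and class names below are purely illustrative:

```python
import math

def softmax(logits):
    """Convert raw model scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(logits, labels):
    """Return the predicted label together with its confidence score."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for a three-class problem.
label, conf = predict_with_confidence([2.0, 0.5, -1.0], ["cat", "dog", "bird"])
```

Downstream components can then act on `conf` (for example, by deferring low-confidence cases) instead of treating every prediction as equally trustworthy.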

Source: MIT AI Risk Repository (mit1013)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

1 - Pre-deployment

Risk ID

mit1013

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.0 > AI system safety, failures, & limitations

Mitigation strategy

1. Implement rigorous Uncertainty Quantification (UQ) and calibration techniques, such as conformal prediction or post-hoc methods like Platt scaling, to ensure the model's assigned confidence scores reliably reflect the true probability of prediction correctness.
2. Establish and enforce confidence-based fail-safe design principles, mandating human-in-the-loop intervention or system deferral for any output below a pre-defined critical confidence threshold, particularly in high-stakes domains.
3. Integrate and prioritize uncertainty-aware evaluation metrics into the model lifecycle, using benchmarks that reward models for appropriate abstention or well-calibrated confidence rather than solely maximizing predictive accuracy.
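
The first two mitigations above can be sketched together: a toy Platt-scaling fit (a logistic map from raw scores to calibrated probabilities, trained by gradient descent on log-loss) followed by a confidence-threshold deferral rule. The held-out scores and labels are invented for illustration; a production system would use a vetted library implementation and a validation set:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit Platt-scaling parameters (a, b) by minimizing log-loss
    on held-out (score, label) pairs with plain gradient descent."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        ga = gb = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            ga += (p - y) * s / n  # gradient of log-loss w.r.t. a
            gb += (p - y) / n      # gradient of log-loss w.r.t. b
        a -= lr * ga
        b -= lr * gb
    return a, b

def route(score, a, b, threshold=0.9):
    """Accept the prediction only when calibrated confidence clears
    the threshold; otherwise defer to a human reviewer."""
    p = sigmoid(a * score + b)
    return ("accept", p) if p >= threshold else ("defer_to_human", p)

# Hypothetical held-out raw scores and binary outcomes.
held_out_scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
held_out_labels = [0, 0, 1, 0, 1, 1]
a, b = fit_platt(held_out_scores, held_out_labels)
decision, calibrated = route(0.5, a, b)
```

The deferral rule implements the fail-safe principle directly: confidence below the critical threshold never reaches the end user unreviewed, regardless of how accurate the underlying model is on average.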