Lack of Interpretability
Due to the black-box nature of most machine learning models, users are typically unable to understand the reasoning behind a model's decisions.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit498
Domain lineage
7. AI System Safety, Failures, & Limitations
7.4 > Lack of transparency or interpretability
Mitigation strategy
1. **Implement Model-Agnostic Post-Hoc Interpretability Techniques.** Employ techniques such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) to generate feature importance scores and localized justifications for individual predictions, thereby externalizing the logic of the complex "black-box" model without sacrificing predictive accuracy.
2. **Integrate Intrinsic Interpretability and Hybrid Architectures.** Where performance requirements permit, prioritize the deployment of inherently interpretable models (e.g., linear models, shallow decision trees) or incorporate transparent components, such as attention mechanisms in deep learning models, to design systems where the decision-making rationale is directly observable.
3. **Establish Comprehensive Transparency Documentation and Reporting.** Develop and maintain clear, accessible documentation and audit trails that detail the AI system's design, operational domain, input feature influence, and the formal justification for its outputs, thereby meeting stakeholder needs for trust, accountability, and regulatory compliance (e.g., the right to explanation).
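The first strategy can be illustrated without any particular library: permutation feature importance is a simple model-agnostic post-hoc technique in the same family as SHAP, treating the model purely as a callable predictor. The `predict` function and the synthetic data below are illustrative assumptions, not part of any specific system.

```python
import random

# Hypothetical "black-box" model: the auditor is assumed to be able to
# call predict() but not to inspect its internal logic (here, a linear
# rule 3*x0 + 0.5*x1 that ignores x2 entirely).
def predict(row):
    return 3.0 * row[0] + 0.5 * row[1] + 0.0 * row[2]

random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [predict(row) for row in X]

def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def permutation_importance(X, y, feature):
    """Shuffle one feature column and measure how much the model's
    error grows; a large increase means the model relies on that feature."""
    shuffled = [row[:] for row in X]
    column = [row[feature] for row in shuffled]
    random.shuffle(column)
    for row, value in zip(shuffled, column):
        row[feature] = value
    return mse([predict(row) for row in shuffled], y)

importances = [permutation_importance(X, y, f) for f in range(3)]
# Feature 0 dominates, feature 2 is irrelevant, so the scores rank 0 > 1 > 2.
```

In practice, libraries such as `shap` and `lime` provide richer, theoretically grounded versions of this idea (including per-prediction local explanations), but the access pattern is the same: probe the model only through its prediction interface.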
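The second strategy, intrinsic interpretability, can be sketched with a one-split decision stump: its entire decision rationale is a single human-readable rule. The threshold search and the toy loan-approval data are invented for illustration.

```python
# A decision stump is an extreme case of a shallow decision tree: one
# feature, one threshold, so the fitted model IS its own explanation.

def fit_stump(xs, labels):
    """Pick the threshold on a single feature that minimizes
    misclassifications for the rule: predict True when x >= threshold."""
    best = None
    for t in sorted(set(xs)):
        errors = sum((x >= t) != label for x, label in zip(xs, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best[0]

# Toy data (assumed): applicant income in k$ -> loan approved?
incomes = [20, 25, 30, 45, 50, 60, 80]
approved = [False, False, False, True, True, True, True]

threshold = fit_stump(incomes, approved)
rule = f"approve if income >= {threshold}"
```

The trade-off named in the strategy shows up directly here: such a model is fully transparent but may underfit complex decision boundaries, which is why the strategy hedges with "where performance requirements permit".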