Risks from models and algorithms (Risks of explainability)
AI algorithms, typified by deep learning, have complex internal workings. Their black-box or grey-box inference processes produce outputs that are hard to predict and hard to trace, so when anomalies arise it is difficult to correct them quickly or to trace their origins for accountability.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
3 - Other
Risk ID
mit681
Domain lineage
7. AI System Safety, Failures, & Limitations
7.4 > Lack of transparency or interpretability
Mitigation strategy
1. Prioritize Explainable AI (XAI) Methodology Integration: Systematically integrate post-hoc and intrinsic XAI techniques (e.g., SHAP, LIME, or attention mechanisms) to translate opaque model outputs into human-comprehensible explanations. This establishes clear traceability for both local (individual decision) and global (overall model) inference processes, directly addressing the untraceable nature of black-box outputs.
2. Mandate Comprehensive Model Governance and Documentation: Establish a stringent AI governance framework that requires meticulous documentation of the AI system's architecture, training data provenance, feature engineering, and validation procedures across the entire lifecycle. This protocol ensures an auditable and accountable decision pathway, mitigating risks related to legal and regulatory compliance.
3. Institute Continuous Model Observability and Auditability: Deploy dedicated AI observability and monitoring platforms to track model performance, detect model and data drift, and audit the stability of generated explanations in real time. This continuous oversight enables the proactive identification and rapid correction of unpredictable, anomalous behavior in the production environment, reducing the challenge of rectification.
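As a rough illustration of the global, post-hoc explanation idea in the first mitigation, the sketch below uses permutation feature importance: shuffle one input feature at a time and measure how much the model's error grows. This is a dependency-free stand-in chosen for brevity, not SHAP or LIME themselves. The `black_box` function, the synthetic data, and the parameter names are all hypothetical and exist only to make the example runnable.

```python
import random


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return, per feature, the mean increase in mean-squared error
    observed when that feature's column is randomly shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for an opaque model: it relies heavily on feature 0,
# slightly on feature 1, and ignores feature 2 entirely.
def black_box(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [black_box(row) for row in X]  # labels the model reproduces exactly

scores = permutation_importance(black_box, X, y)
for index, score in enumerate(scores):
    print(f"feature {index}: importance {score:.4f}")
```

A feature whose shuffling barely changes the error contributes little to the model's decisions, which gives reviewers a first, coarse handle on an otherwise opaque inference process. Local, per-decision explanations would need a different technique, such as the SHAP or LIME methods named above.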