7. AI System Safety, Failures, & Limitations > 3 - Other

Opacity (the black box problem)

Opacity surrounding the internal, technical decision-making processes of generative AI models is popularly known as the “black box problem.”[277] Generative AI models, most commonly built on deep neural networks with hundreds of billions of internal connections,[278] have become so complex that their internal decision-making processes are no longer traceable or interpretable even to the most advanced expert observers. As a result, while a system's inputs and outputs can be observed, developers cannot explain in detail why specific inputs produce specific outputs.

Source: MIT AI Risk Repository (mit727)

ENTITY

3 - Other

INTENT

2 - Unintentional

TIMING

3 - Other

Risk ID

mit727

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.4 > Lack of transparency or interpretability

Mitigation strategy

1. Adopt and deploy Explainable AI (XAI) methodologies, such as post-hoc techniques (e.g., SHAP, LIME) or intrinsically interpretable 'glass-box' models, to ensure that the rationale behind specific model outputs is traceable and human-understandable.
2. Mandate rigorous audit processes, including detailed logging and versioning of all model inputs, outputs, and decision-making steps, to establish a verifiable and complete audit trail for regulatory compliance and post-incident analysis.
3. Institute robust, scenario-based testing frameworks (e.g., stress testing and security testing) combined with model validation protocols to detect anomalies, systemic biases, and vulnerabilities, thereby ensuring the model's behavior is predictable and reliable across all operational contexts.
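To make the first mitigation item concrete, here is a minimal sketch of a model-agnostic, post-hoc probe: permutation importance. This is a simpler cousin of the SHAP/LIME techniques the entry names, not a reimplementation of them. The toy `model` function and its weights are hypothetical stand-ins for an opaque system; the point is that the probe only observes inputs and outputs, exactly the black-box setting described above.

```python
import random

def model(x):
    # Hypothetical "black box": a fixed linear scorer standing in for an
    # opaque model. Weights are illustrative only: feature 0 dominates,
    # feature 2 has no effect on the output.
    w = [3.0, 1.0, 0.0]
    return sum(wi * xi for wi, xi in zip(w, x))

def permutation_importance(predict, rows, trials=20, seed=0):
    """Score each feature by how much shuffling it perturbs predictions.

    Model-agnostic and post-hoc, in the spirit of SHAP/LIME: `predict`
    is treated strictly as a black box whose outputs we observe.
    """
    rng = random.Random(seed)
    baseline = [predict(r) for r in rows]
    n_features = len(rows[0])
    importances = []
    for j in range(n_features):
        total = 0.0
        for _ in range(trials):
            # Shuffle one feature column, leaving the others intact.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            preds = [predict(r) for r in shuffled]
            # Mean absolute change in the prediction is the importance signal.
            total += sum(abs(p - b) for p, b in zip(preds, baseline)) / len(rows)
        importances.append(total / trials)
    return importances

random.seed(1)
data = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
scores = permutation_importance(model, data)
print(scores)  # feature 0 scores highest; feature 2 scores zero
```

The probe recovers the toy model's structure without inspecting its internals, which is the trade-off post-hoc XAI accepts: explanations about behavior, not about mechanism.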