Explainability
A recurring concern about AI algorithms is their lack of explainability: information about how an algorithm arrives at its results is often deficient (Deeks, 2019). For generative AI models in particular, the reasoning by which the model produces its results is not transparent (Dwivedi et al., 2023). This opacity raises several issues. First, users may find it difficult to interpret and understand the output (Dwivedi et al., 2023). They may also struggle to detect mistakes in the output (Rudin, 2019). Further, when the output cannot be interpreted and evaluated, users may have trouble trusting the system and its responses or recommendations (Burrell, 2016). Finally, from a legal and regulatory perspective, it is hard for a regulatory body to judge whether a generative AI system is unfair or biased (Rieder & Simon, 2017).
ENTITY
3 - Other
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit543
Domain lineage
7. AI System Safety, Failures, & Limitations
7.4 > Lack of transparency or interpretability
Mitigation strategy
1. Employ model-agnostic post-hoc explainability techniques: Implement methods such as LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) to provide local, instance-level justifications for generative AI outputs. Concurrently, mandate intrinsic transparency through documentation (e.g., Model Cards) detailing the model's architecture, training data sources, known limitations, and intended use to provide global context for its behavior.
2. Establish continuous governance and auditing for fairness: Institute a continuous monitoring framework to systematically audit model outputs for latent biases and unintended consequences, using fairness metrics (e.g., Disparate Impact, Equalized Odds). This mitigates regulatory risk by ensuring the system does not produce unexplainably unfair outcomes and provides auditable evidence for regulatory compliance.
3. Integrate human-in-the-loop (HITL) validation: Implement a supervisory structure in which human experts review, validate, and, where necessary, override decisions or content generated by the system, particularly in high-risk or critical application domains (e.g., medical, legal). This operational measure enhances safety, builds stakeholder trust, and serves as an essential check against uninterpretable errors.
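The fairness-auditing step in strategy 2 can be illustrated with a minimal sketch. The snippet below computes the Disparate Impact ratio (the selection rate for a protected group divided by that of the reference group); the group labels, prediction lists, and the 0.8 "four-fifths rule" threshold are illustrative assumptions, not part of this entry.

```python
def disparate_impact(predictions, groups, protected, favorable=1):
    """Ratio of favorable-outcome rates: P(favorable | protected) / P(favorable | reference)."""
    prot = [p for p, g in zip(predictions, groups) if g == protected]
    ref = [p for p, g in zip(predictions, groups) if g != protected]
    rate_prot = sum(1 for p in prot if p == favorable) / len(prot)
    rate_ref = sum(1 for p in ref if p == favorable) / len(ref)
    return rate_prot / rate_ref

# Hypothetical audit data: model decisions (1 = favorable) per applicant group.
preds  = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
ratio = disparate_impact(preds, groups, protected="A")
# Group A rate = 0.4, group B rate = 0.8, so ratio = 0.5 — below the common
# four-fifths (0.8) convention, which an auditor would flag for review.
```

A continuous monitoring framework would run such checks on batches of logged outputs and retain the results as auditable evidence, as the strategy describes.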