Lack of data transparency
Lack of data transparency arises when the details of training or tuning datasets are insufficiently documented.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
1 - Pre-deployment
Risk ID
mit1324
Domain lineage
6. Socioeconomic and Environmental
6.5 > Governance failure
Mitigation strategy
1. Establish and enforce a comprehensive Data Governance Framework that explicitly mandates standardized, detailed documentation (metadata) for all AI training and tuning datasets, including their origin, collection methods, curation processes, and any data augmentation or synthetic data generation steps. This structural policy ensures accountability and defines the data stewardship roles responsible for documentation integrity.
2. Implement a robust Data Lineage and Metadata Management system to automatically track the complete lifecycle of training data assets, from ingestion and processing to model deployment. This system must provide an immutable audit trail for all modifications, enabling complete traceability and supporting explanations of model behavior (explainability).
3. Conduct mandatory, periodic, independent audits and reviews of all dataset documentation against established transparency requirements and regulatory standards (e.g., the EU AI Act, GDPR). The objective is to proactively identify and remediate documentation gaps, verify data representativeness, and validate the consistent application of fairness and security controls before deployment.
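The documentation mandate in strategy 1 and the audit trail in strategy 2 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and not part of any specific governance tool: the `DatasetRecord` fields, the `REQUIRED_FIELDS` list (taken from the items strategy 1 enumerates), and the `AuditTrail` class. The hash chain makes the log tamper-evident rather than truly immutable; a production system would also need write-protected storage.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

# Hypothetical required fields, mirroring the items listed in strategy 1.
REQUIRED_FIELDS = ("origin", "collection_method", "curation_process", "augmentation_steps")


@dataclass
class DatasetRecord:
    """Documentation (metadata) for one training or tuning dataset."""
    name: str
    origin: str = ""
    collection_method: str = ""
    curation_process: str = ""
    augmentation_steps: str = ""  # write "none" if no augmentation was applied


def missing_fields(record: DatasetRecord) -> list[str]:
    """Return the required documentation fields that are still empty."""
    values = asdict(record)
    return [name for name in REQUIRED_FIELDS if not values[name].strip()]


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditTrail:
    """Append-only log; each entry stores the hash of the previous entry."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, event: str, record: DatasetRecord) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        # asdict() snapshots the record, so later edits do not alter old entries.
        body = {"event": event, "record": asdict(record), "prev_hash": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute every hash and check that the chain links are intact."""
        prev_hash = ""
        for entry in self.entries:
            body = {key: entry[key] for key in ("event", "record", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

An audit of the kind described in strategy 3 could then call `missing_fields` on every record to list documentation gaps and `AuditTrail.verify` to confirm that no logged entry was altered after the fact.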