Lack of data understanding
A correct understanding of the data used to develop an AI system is a prerequisite for avoiding data shortcomings. A lack of this understanding hinders the development of an AI system that is best suited to its intended functionality.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
1 - Pre-deployment
Risk ID
mit1000
Domain lineage
7. AI System Safety, Failures, & Limitations
7.0 > AI system safety, failures, & limitations
Mitigation strategy
1. Establish a Formal Data Governance and Stewardship Program. This program must define clear policies for data acquisition, curation, validation, and maintenance, and assign data stewards responsible for the quality, context, and fitness-for-purpose of specific datasets, which is foundational to a correct data understanding.
2. Mandate Comprehensive Data Discovery and Quality Audits. Implement automated and continuous data profiling, lineage tracking, and auditing mechanisms to systematically assess data completeness, accuracy, and representativeness *prior* to AI system development, empirically validating the model development team's understanding of the data's characteristics.
3. Develop and Maintain a Centralized Data Catalog with Rich Metadata. Enforce the creation of detailed metadata (definitions, lineage, data schemas, known biases/limitations) for all training data assets. This standardized documentation ensures that all AI actors share a correct and current operational understanding of the underlying data.
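As an illustration of the data profiling and quality audit described in the second strategy, the following is a minimal sketch using only the Python standard library. It is not taken from the source entry: the names `ColumnProfile`, `profile_column`, and `audit`, the sample rows, and the threshold values are all hypothetical. The share of the most frequent value serves here only as a crude proxy for representativeness. A real audit would be tailored to the dataset and its intended use.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Hypothetical per-column summary used for a pre-development audit."""
    name: str
    completeness: float   # fraction of rows with a non-missing value
    distinct_values: int  # number of distinct non-missing values
    top_share: float      # share of the most frequent non-missing value


def profile_column(name, rows):
    """Profile one column of a dataset given as a list of dicts."""
    values = [row.get(name) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    completeness = len(present) / len(values) if values else 0.0
    top_share = max(counts.values()) / len(present) if present else 0.0
    return ColumnProfile(name, completeness, len(counts), top_share)


def audit(rows, columns, min_completeness=0.95, max_top_share=0.9):
    """Return human-readable findings; the thresholds are illustrative assumptions."""
    findings = []
    for name in columns:
        p = profile_column(name, rows)
        if p.completeness < min_completeness:
            findings.append(
                f"{name}: completeness {p.completeness:.2f} is below {min_completeness}"
            )
        if p.top_share > max_top_share:
            findings.append(
                f"{name}: one value covers {p.top_share:.2f} of rows (possible imbalance)"
            )
    return findings


# Made-up sample rows, for demonstration only.
rows = [
    {"age": 34, "region": "north", "label": "approve"},
    {"age": None, "region": "north", "label": "approve"},
    {"age": 51, "region": "north", "label": "approve"},
    {"age": 29, "region": "south", "label": "deny"},
]

for finding in audit(rows, ["age", "region", "label"],
                     min_completeness=0.9, max_top_share=0.7):
    print(finding)
```

Findings of this kind could be recorded alongside the dataset's catalog metadata, so that known gaps and imbalances are visible to the development team before modeling begins.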