7. AI System Safety, Failures, & Limitations | 1 - Pre-deployment

Lack of data understanding

A correct understanding of the data used to develop an AI system is a prerequisite for avoiding data shortcomings. A lack of that understanding hinders the development of an AI system that is best suited to its intended functionality.

Source: MIT AI Risk Repository (mit1000)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

1 - Pre-deployment

Risk ID

mit1000

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.0 > AI system safety, failures, & limitations

Mitigation strategy

1. Establish a Formal Data Governance and Stewardship Program. This program must define clear policies for data acquisition, curation, validation, and maintenance, and assign data stewards responsible for the quality, context, and fitness-for-purpose of specific datasets, which is foundational to a correct data understanding.

2. Mandate Comprehensive Data Discovery and Quality Audits. Implement automated and continuous data profiling, lineage tracking, and auditing mechanisms to systematically assess data completeness, accuracy, and representativeness *prior* to AI system development, empirically validating the model development team's understanding of the data's characteristics.

3. Develop and Maintain a Centralized Data Catalog with Rich Metadata. Enforce the creation of detailed metadata (definitions, lineage, data schemas, known biases/limitations) for all training data assets. This standardized documentation ensures that all AI actors share a correct and current operational understanding of the underlying data.
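
As an illustration of how the audit and catalog strategies above could fit together, the following is a minimal Python sketch and not a prescribed implementation. Every name in it (`CatalogEntry`, `profile`, `representation`, `audit`) and both thresholds (`min_completeness`, `min_share`) are hypothetical choices made for this example. It assumes the dataset is small enough to hold in memory as a list of dictionaries, and it treats `None` and the empty string as missing values.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical metadata record for one training dataset."""
    name: str
    steward: str                 # person accountable for the dataset
    schema: dict                 # column name -> human-readable definition
    lineage: list                # upstream sources, in order
    known_limitations: list = field(default_factory=list)


def _is_present(value):
    # Assumption: None and "" count as missing; 0 and False do not.
    return value is not None and value != ""


def profile(rows, schema):
    """Return per-column completeness (share of non-missing values)."""
    total = len(rows)
    report = {}
    for column in schema:
        present = sum(1 for row in rows if _is_present(row.get(column)))
        report[column] = present / total if total else 0.0
    return report


def representation(rows, column):
    """Return the share of each observed value in `column`."""
    counts = Counter(
        row[column] for row in rows if _is_present(row.get(column))
    )
    seen = sum(counts.values())
    return {value: n / seen for value, n in counts.items()} if seen else {}


def audit(rows, entry, min_completeness=0.95, min_share=0.10,
          group_column=None):
    """Collect findings and record them as known limitations on the entry."""
    findings = []
    for column, share in profile(rows, entry.schema).items():
        if share < min_completeness:
            findings.append(f"{column}: only {share:.0%} of values present")
    if group_column is not None:
        for value, share in representation(rows, group_column).items():
            if share < min_share:
                findings.append(
                    f"{group_column}={value}: only {share:.0%} of rows"
                )
    # Writing findings back to the catalog keeps the documentation current.
    entry.known_limitations.extend(findings)
    return findings
```

Calling `audit(rows, entry, group_column="region")` before model development would return a list of human-readable findings and append them to `entry.known_limitations`, so the catalog entry reflects what the audit found. A real program would likely add accuracy checks against a trusted reference and persist the catalog; both are outside the scope of this sketch.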