7. AI System Safety, Failures, & Limitations | 1 - Pre-deployment

Data-related (Lack of cross-organizational documentation)

When data is shared among multiple organizations, documentation may be missing or inadequate, making the data difficult for recipient organizations to understand. For example, missing metadata or a schema change made by a collaborating party can render a dataset unusable and waste data collection effort, or it can lead to misunderstandings about the dataset’s limitations, creating downstream risks when the dataset is used [173].

Source: MIT AI Risk Repository (mit1095)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

1 - Pre-deployment

Risk ID

mit1095

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.3 > Lack of capability or robustness

Mitigation strategy

1. Establish a Coordinated Cross-Organizational Data Governance Framework. Implement a formal, mutually agreed-upon data governance structure, including a transparent decision-making process and comprehensive data sharing agreements (e.g., eMOU, data licenses) to define roles, responsibilities, and permissible use for all shared datasets across collaborating entities.

2. Mandate Standardized Metadata and Data Schemas. Require the uniform application of established metadata standards and common data elements (CDEs) for all shared data assets. This ensures consistency in data structure, terminology, and content documentation, thereby clarifying data context, improving interoperability, and reducing ambiguity for downstream users.

3. Implement Continuous Data Lineage and Quality Management. Utilize systems that track data lineage and traceability to provide a clear, auditable trail of a dataset's origin, transformations, and flow across organizational boundaries. This must be paired with continuous data quality monitoring to detect and alert stakeholders to schema changes, anomalies, or inconsistencies immediately upon occurrence.
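
The metadata and schema-monitoring steps above could be approximated with a small check that runs whenever a partner organization delivers data. The following is a minimal sketch, not a prescribed implementation: the required metadata keys in `REQUIRED_METADATA`, the `SchemaReport` class, and the `check_shared_dataset` function are all hypothetical names chosen for illustration, and a real deployment would take its required fields from the data sharing agreement and an established metadata standard.

```python
from dataclasses import dataclass, field

# Hypothetical metadata keys; a real list would come from the sharing agreement.
REQUIRED_METADATA = ("owner", "license", "schema_version", "collected_on")


@dataclass
class SchemaReport:
    """Findings from comparing one delivery against the agreed schema."""
    missing_metadata: list = field(default_factory=list)
    missing_fields: list = field(default_factory=list)
    unexpected_fields: list = field(default_factory=list)
    type_mismatches: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # The delivery passes only if every finding list is empty.
        return not (self.missing_metadata or self.missing_fields
                    or self.unexpected_fields or self.type_mismatches)


def check_shared_dataset(metadata: dict, expected_schema: dict,
                         records: list) -> SchemaReport:
    """Flag absent metadata and any drift from the agreed field names and types."""
    report = SchemaReport()
    # Metadata keys that are absent or empty count as missing documentation.
    report.missing_metadata = [k for k in REQUIRED_METADATA if not metadata.get(k)]

    expected = set(expected_schema)
    for index, record in enumerate(records):
        seen = set(record)
        # Fields the agreement promises but this record lacks.
        for name in sorted(expected - seen):
            if name not in report.missing_fields:
                report.missing_fields.append(name)
        # Fields the agreement does not mention (e.g. a renamed column).
        for name in sorted(seen - expected):
            if name not in report.unexpected_fields:
                report.unexpected_fields.append(name)
        # Fields present under the agreed name but with a different type.
        for name in sorted(expected & seen):
            if not isinstance(record[name], expected_schema[name]):
                report.type_mismatches.append(
                    (index, name, type(record[name]).__name__))
    return report
```

For instance, if the agreed schema is `{"sample_id": str, "reading": float}` and a partner silently renames `reading` to `reading_mm`, the report lists `reading` under `missing_fields` and `reading_mm` under `unexpected_fields`, which gives the receiving organization a concrete signal to raise before the data is used downstream.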