7. AI System Safety, Failures, & Limitations

1 - Pre-deployment

Risks from data (Risks of unregulated training data annotation)

Issues with training data annotation, such as incomplete annotation guidelines, incapable annotators, and errors in annotation, can affect the accuracy, reliability, and effectiveness of models and algorithms. Moreover, they can introduce training biases, amplify discrimination, reduce generalization abilities, and result in incorrect outputs.

Source: MIT AI Risk Repository (mit689)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

1 - Pre-deployment

Risk ID

mit689

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.3 > Lack of capability or robustness

Mitigation strategy

1. Develop and enforce highly detailed, unambiguous annotation guidelines and protocols, including explicit handling of edge cases, visual examples, and a formal version control system, to minimize ambiguity and ensure consistency across all data points.

2. Institute a comprehensive annotator training and calibration program, focusing on domain-specific knowledge, the rationale behind the guidelines, and continuous bias awareness, to ensure a shared understanding and high initial labeling accuracy.

3. Implement a rigorous, multi-level Quality Assurance (QA) framework utilizing systematic checks such as Inter-Annotator Agreement (IAA) metrics and Gold Standard datasets, with a continuous feedback mechanism to drive iterative refinement of guidelines and annotator performance.
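The QA checks named in the third mitigation can be illustrated with a short sketch. It is not taken from the repository entry. It assumes Cohen's kappa as the IAA metric for a pair of annotators, which is one common choice among several, and a gold-standard set stored as a mapping from item ID to the trusted label. The function names `cohen_kappa` and `gold_accuracy`, the item IDs, and the labels are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, so this sketch reports full agreement (an assumption).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def gold_accuracy(annotations, gold):
    """Fraction of gold-standard items that an annotator labeled correctly.

    Both arguments map item ID -> label; only items present in both are scored.
    """
    scored = [item for item in gold if item in annotations]
    if not scored:
        raise ValueError("annotator labeled no gold-standard items")
    return sum(annotations[item] == gold[item] for item in scored) / len(scored)


if __name__ == "__main__":
    # Hypothetical labels from two annotators on the same eight items.
    annotator_a = ["cat", "dog", "cat", "cat", "dog", "dog", "cat", "dog"]
    annotator_b = ["cat", "dog", "dog", "cat", "dog", "cat", "cat", "dog"]
    print("kappa:", cohen_kappa(annotator_a, annotator_b))

    # Hypothetical gold-standard items mixed into one annotator's queue.
    gold = {"item1": "cat", "item4": "cat", "item6": "dog"}
    submitted = {"item1": "cat", "item2": "dog", "item4": "dog", "item6": "dog"}
    print("gold accuracy:", gold_accuracy(submitted, gold))
```

A low kappa across many annotator pairs tends to point at ambiguous guidelines, while a low gold-standard score for a single annotator more likely points at a training gap. Telling those two causes apart is what lets the feedback loop in the third mitigation target either the guidelines or the annotator. The thresholds that trigger a guideline revision or retraining are project-specific and are not specified in the source.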