Benchmarking (Guideline contamination)
Guideline contamination refers to scenarios where instructions for the collection, annotation, or use of a dataset are exposed to the model [170]. These instructions may contain explicit data-label pairs that can improve the model's performance on the task, inflating benchmark scores.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
1 - Pre-deployment
Risk ID
mit1119
Domain lineage
6. Socioeconomic and Environmental
6.5 > Governance failure
Mitigation strategy
1. **Prioritize Dynamic Benchmarking Paradigms**: Shift from static evaluation sets to dynamic methods, such as continuously updating or algorithmically regenerating benchmark data, to ensure temporal and content separation from large language model training corpora.
2. **Institute Rigorous Data Segregation and Curation**: Implement strict governance protocols to prevent the exposure of benchmark instructions, data-label pairs, and metadata to the model's training and fine-tuning datasets, effectively achieving an air-gapped separation between evaluation and development assets.
3. **Employ Post-Hoc Contamination Detection**: Utilize advanced detection methodologies (e.g., matching-based or comparison-based analyses) following model training to empirically quantify the degree of data overlap and confirm the validity of performance metrics.
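As an illustration of the matching-based detection mentioned in strategy 3, the sketch below flags benchmark items that share a word-level n-gram with any training document. This is a minimal, assumption-laden example: the function names, the 8-gram window, and the whitespace tokenization are illustrative choices, not a prescribed method from the source.

```python
def ngrams(text, n=8):
    """Return the set of word-level n-grams in a text (lowercased)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(benchmark_items, training_docs, n=8):
    """Fraction of benchmark items sharing at least one n-gram with
    any training document -- a simple matching-based detector."""
    train_grams = set()
    for doc in training_docs:
        train_grams |= ngrams(doc, n)
    flagged = sum(1 for item in benchmark_items
                  if ngrams(item, n) & train_grams)
    return flagged / len(benchmark_items) if benchmark_items else 0.0
```

In practice, real contamination audits tokenize with the model's own tokenizer and scan training corpora in a streaming fashion rather than holding all n-grams in memory; this sketch only conveys the core overlap test.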