Benchmarking (Benchmark leakage or data contamination)
Benchmark leakage [235, 224, 221, 161] can occur when an AI model is trained or fine-tuned on evaluation-related data. This can make model evaluation unreliable, especially if the training data contains question-answer pairs drawn from benchmarks.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
1 - Pre-deployment
Risk ID
mit1116
Domain lineage
6. Socioeconomic and Environmental
6.5 > Governance failure
Mitigation strategy
1. Utilize a strategy of Continuous Benchmark Renewal or Time-Gated Dataset Construction to ensure the evaluation set is composed of data published or created after the model's training data cut-off date, thereby structurally eliminating verbatim and near-identical contamination (Sources 4, 15, 16, 20).
2. Mandate the creation of a Benchmark Transparency Card at model release, which formally documents any utilization of standard benchmark datasets in the model's training or fine-tuning process, to allow for fair and informed comparison of model performance and generalization capabilities (Sources 4, 18).
3. Employ Counterfactual Rewriting Frameworks (e.g., LASTINGBENCH) to identify and perturb known leakage points within existing evaluation datasets, transforming them into semantically preserved yet counterfactual queries that disrupt latent memorization without altering the core evaluative intent of the benchmark (Sources 2, 7).
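The time-gating idea in strategy 1 can be sketched in a few lines: filter candidate evaluation items by publication date against the model's training cut-off, so any item the model could have seen is excluded by construction. This is a minimal illustration, not a prescribed implementation; the cut-off date, item schema, and field names (`question`, `published`) are assumptions.

```python
from datetime import date

# Assumed training-data cut-off date (hypothetical value).
TRAINING_CUTOFF = date(2023, 4, 30)

def time_gate(items, cutoff=TRAINING_CUTOFF):
    """Keep only evaluation items created strictly after the cut-off,
    structurally ruling out verbatim contamination from training data."""
    return [item for item in items if item["published"] > cutoff]

# Hypothetical candidate pool: one pre-cutoff item, one post-cutoff item.
candidates = [
    {"question": "Q1", "published": date(2023, 1, 15)},
    {"question": "Q2", "published": date(2023, 9, 2)},
]
fresh_eval_set = time_gate(candidates)  # only "Q2" survives the gate
```

In practice the gate would run continuously as new data is published, renewing the benchmark each evaluation cycle rather than filtering a static pool once.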