Benchmarking (Post-deployment contamination)
Once a model is deployed, it can be exposed to benchmark data provided by users [95, 170]. The model may then be further trained on user inputs that contain benchmark data.
ENTITY
3 - Other
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit1121
Domain lineage
6. Socioeconomic and Environmental
6.5 > Governance failure
Mitigation strategy
1. Adopt a Dynamic, Contamination-Controlled Evaluation Framework. Implement dynamic benchmarks and generative test-evolution protocols (e.g., Knowledge-enhanced Benchmark Evolution) in which evaluation samples are continuously updated or synthetically generated. This keeps the test data temporally and syntactically novel, preventing its inclusion in post-deployment user feedback or subsequent fine-tuning corpora.
2. Implement Robust Data Filtering and Sanitization for User Feedback. Establish a rigorous data-handling pipeline for all user inputs and post-deployment data streams. This pipeline must incorporate real-time filtering mechanisms, such as n-gram overlap detection or membership inference, to identify and quarantine any data that matches known public or proprietary benchmark samples before it can be used for model fine-tuning or re-training (a minimal filtering sketch follows this list).
3. Conduct Continuous and Quantitative Contamination Audits. Mandate the regular application of formal contamination-detection metrics, such as the Kernel Divergence Score, to continuously quantify the degree of benchmark leakage. These audits provide an objective measure of evaluation integrity and should serve as a mandatory gate for any model version release or performance claim (a simplified audit sketch also follows the list).
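A minimal sketch of the n-gram overlap check from item 2 is given below, assuming word-level 8-grams, a 0.3 overlap threshold, and a hypothetical loader for known benchmark texts; none of these values come from the cited sources, and a production pipeline would likely pair this filter with membership-inference checks.

```python
from typing import Iterable, Set, Tuple

NGRAM_SIZE = 8           # assumed window size; tune per corpus
OVERLAP_THRESHOLD = 0.3  # assumed fraction of shared n-grams that triggers quarantine

def ngrams(text: str, n: int = NGRAM_SIZE) -> Set[Tuple[str, ...]]:
    """Lower-cased word n-grams of a text sample."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def build_benchmark_index(benchmark_samples: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Union of n-grams over all known benchmark samples."""
    index: Set[Tuple[str, ...]] = set()
    for sample in benchmark_samples:
        index |= ngrams(sample)
    return index

def is_contaminated(user_input: str, benchmark_index: Set[Tuple[str, ...]]) -> bool:
    """Flag a user input whose n-gram overlap with the benchmark index is high."""
    grams = ngrams(user_input)
    if not grams:
        return False
    overlap = len(grams & benchmark_index) / len(grams)
    return overlap >= OVERLAP_THRESHOLD

# Example: quarantine flagged inputs before they reach the fine-tuning corpus.
# benchmark_index = build_benchmark_index(load_benchmark_texts())  # hypothetical loader
# clean_inputs = [x for x in user_inputs if not is_contaminated(x, benchmark_index)]
```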
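For the audits in item 3, the Kernel Divergence Score is defined in the contamination-detection literature and is not reproduced here; the sketch below instead uses a simpler perplexity-gap heuristic (verbatim benchmark samples versus paraphrased controls) to illustrate what a quantitative audit gate might look like. The model name, the 0.5 nats-per-token threshold, and the control set are assumptions.

```python
import torch
from statistics import mean
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "my-org/deployed-model"  # hypothetical checkpoint under audit
GAP_THRESHOLD = 0.5                   # assumed NLL gap (nats/token) treated as suspicious

def mean_nll(model, tokenizer, text: str) -> float:
    """Mean per-token negative log-likelihood of `text` under the model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return out.loss.item()

def contamination_audit(benchmark_texts, control_texts) -> dict:
    """Compare loss on verbatim benchmark samples vs. paraphrased controls.

    A large positive gap (controls much harder than verbatim samples) suggests
    the benchmark text was memorized via post-deployment training data.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()
    bench_nll = mean(mean_nll(model, tokenizer, t) for t in benchmark_texts)
    control_nll = mean(mean_nll(model, tokenizer, t) for t in control_texts)
    gap = control_nll - bench_nll
    return {"benchmark_nll": bench_nll, "control_nll": control_nll,
            "gap": gap, "flagged": gap >= GAP_THRESHOLD}
```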