General Evaluations (Biased evaluations of encoded human values)
Human values encoded in AI models that are easier to evaluate may be favored for inclusion in evaluations over values that are harder to measure [13]. This can come at the expense of more desirable but harder-to-quantify values, producing an imbalance in which easily measured values dominate the evaluation process while other important values are underrepresented.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
1 - Pre-deployment
Risk ID
mit1114
Domain lineage
6. Socioeconomic and Environmental
6.5 > Governance failure
Mitigation strategy
1. Mandate the use of structured human evaluation frameworks with diverse reviewer panels to explicitly assess model performance against complex, difficult-to-quantify values and ethical considerations, ensuring that nuanced biases missed by automated metrics are captured.
2. Adopt and enforce disaggregated fairness metrics, such as Equalized Odds or demographic parity, to move beyond simple aggregated accuracy. This ensures that error rates and predictive outcomes are statistically equitable across all protected and underrepresented demographic and value-based subgroups.
3. Integrate algorithm-centric debiasing techniques, such as Fair Representation Learning or fairness constraints/regularization, during the model training phase to proactively encode and optimize for harder-to-measure values, thereby preventing proxy variables from dominating the learning process.
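The disaggregated metrics named in strategy 2 can be sketched as follows. This is a minimal illustration, not a reference implementation: the group labels, toy data, and function name are assumptions, and it computes the demographic parity gap (difference in positive-prediction rates) and an equalized-odds gap (worst-case difference in true- or false-positive rates) across subgroups.

```python
# Sketch: disaggregated fairness metrics per subgroup.
# All data and names below are illustrative assumptions.
from collections import defaultdict

def fairness_gaps(y_true, y_pred, groups):
    """Return (demographic parity gap, equalized-odds gap).

    Demographic parity gap: max difference in positive-prediction
    rate between any two groups. Equalized-odds gap: max difference
    in TPR or FPR between any two groups.
    """
    stats = defaultdict(lambda: {"pos": 0, "n": 0, "tp": 0,
                                 "p": 0, "fp": 0, "neg": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["pos"] += p          # predicted-positive count
        if t == 1:
            s["p"] += 1        # actual positives
            s["tp"] += p       # true positives
        else:
            s["neg"] += 1      # actual negatives
            s["fp"] += p       # false positives
    rates = [s["pos"] / s["n"] for s in stats.values()]
    tprs = [s["tp"] / s["p"] for s in stats.values() if s["p"]]
    fprs = [s["fp"] / s["neg"] for s in stats.values() if s["neg"]]
    dp_gap = max(rates) - min(rates)
    eo_gap = max(max(tprs) - min(tprs), max(fprs) - min(fprs))
    return dp_gap, eo_gap

# Toy example: two groups with equal size but different outcomes.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
dp, eo = fairness_gaps(y_true, y_pred, groups)
# dp = 0.5 (group a receives positives at 0.75 vs 0.25 for group b)
```

An aggregated accuracy score would mask this disparity: both groups are classified with the same overall accuracy, yet the subgroup gaps are large, which is exactly the failure mode disaggregated reporting is meant to expose.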