Unfair capability distribution

Performing worse for some groups than others in a way that harms the worse-off group

Source: MIT AI Risk Repository (mit1357)

| Field | Value |
| --- | --- |
| ENTITY | 2 - AI |
| INTENT | 2 - Unintentional |
| TIMING | 2 - Post-deployment |
| Risk ID | mit1357 |
| Domain lineage | 1. Discrimination & Toxicity (156 mapped risks) > 1.3 Unequal performance across groups |

Mitigation strategy

1. Apply **Post-Processing Calibration Techniques** such as **Threshold Adjustment** to equalize performance outcomes: calculate and apply distinct prediction thresholds for disadvantaged subgroups so that fairness gap metrics such as the Equal Opportunity Difference (EOD) or Statistical Parity Difference are minimized (see the first sketch below).
2. Integrate **In-Processing Fairness-Aware Algorithms** during model training, either through **Fair Regularization** (adding a discrimination penalty term to the loss function) or **Adversarial Debiasing** (enforcing representational independence from sensitive attributes), so the model avoids learning the unfair capability distribution in the first place (second sketch).
3. Establish a **Continuous Auditing and Monitoring** framework for the deployed system that routinely measures model performance and the chosen fairness metrics across all identified subgroups, so emerging **bias drift** or degradation in equitable service is detected and remediated promptly (third sketch).
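As a concrete illustration of item 1, the sketch below searches a per-group decision threshold so that each subgroup's true positive rate matches a common target, which drives the Equal Opportunity Difference toward zero. All data, function names, and the threshold grid here are illustrative assumptions, not part of the repository entry.

```python
# Minimal threshold-adjustment sketch (hypothetical data and names).
# Per-group thresholds are chosen so group TPRs converge, minimizing EOD.
import numpy as np

def true_positive_rate(y_true, y_pred):
    """TPR = TP / (TP + FN) over the positive class."""
    positives = y_true == 1
    return y_pred[positives].mean() if positives.any() else 0.0

def fit_group_thresholds(y_true, scores, groups, grid=np.linspace(0.05, 0.95, 19)):
    """For each group, pick the threshold whose TPR is closest to the
    overall TPR at the default 0.5 threshold (shrinking the EOD)."""
    target_tpr = true_positive_rate(y_true, scores >= 0.5)
    thresholds = {}
    for g in np.unique(groups):
        mask = groups == g
        gaps = [abs(true_positive_rate(y_true[mask], scores[mask] >= t) - target_tpr)
                for t in grid]
        thresholds[g] = grid[int(np.argmin(gaps))]
    return thresholds

def predict_with_thresholds(scores, groups, thresholds):
    return np.array([scores[i] >= thresholds[g] for i, g in enumerate(groups)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    groups = rng.integers(0, 2, n)          # two demographic subgroups
    y_true = rng.integers(0, 2, n)
    # simulate a model that is systematically under-confident for group 1
    scores = np.clip(y_true * 0.6 + rng.normal(0.2, 0.2, n) - 0.15 * groups, 0, 1)
    th = fit_group_thresholds(y_true, scores, groups)
    y_hat = predict_with_thresholds(scores, groups, th)
    for g in (0, 1):
        m = groups == g
        print(f"group {g}: threshold={th[g]:.2f} "
              f"TPR={true_positive_rate(y_true[m], y_hat[m]):.3f}")
```

Because thresholds are applied after training, this technique needs no retraining and can be rolled out or rolled back independently of the model.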
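For item 2, a minimal fair-regularization sketch in PyTorch, assuming a toy classifier, random batch data, and a statistical-parity penalty; the architecture and the `lam` weight are tuning choices, not prescribed by the repository. Adversarial debiasing would instead train a second network to predict the sensitive attribute from the model's representation and penalize its success.

```python
# Fair-regularization sketch (assumed architecture and data).
# A statistical-parity gap |E[p | g=0] - E[p | g=1]| is added to the loss,
# penalizing group-dependent behaviour during training itself.
import torch
import torch.nn as nn

def fairness_penalty(probs, groups):
    """Absolute gap between mean predicted probability per subgroup.
    Assumes both groups are present in every batch."""
    p0 = probs[groups == 0].mean()
    p1 = probs[groups == 1].mean()
    return (p0 - p1).abs()

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 0.5  # strength of the fairness regularizer (a tuning choice)

# toy batch; in practice x, y, groups come from the training pipeline
x = torch.randn(256, 10)
y = torch.randint(0, 2, (256, 1)).float()
groups = torch.randint(0, 2, (256,))

for step in range(100):
    optimizer.zero_grad()
    logits = model(x)
    probs = torch.sigmoid(logits).squeeze(1)
    loss = bce(logits, y) + lam * fairness_penalty(probs, groups)
    loss.backward()
    optimizer.step()
```

Raising `lam` trades predictive accuracy for a smaller parity gap; the right balance depends on the deployment context and the fairness metric the team has committed to.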
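For item 3, a monitoring sketch under stated assumptions: the window identifiers, the 0.05 accuracy-gap tolerance, and the simulated degradation for group 1 are all hypothetical. A production framework would log each `AuditResult` and alert an owner whenever `drifted` is set.

```python
# Continuous-auditing sketch (hypothetical windows, tolerance, and data).
# Each monitoring window recomputes per-group accuracy and flags bias drift
# when the gap between the best- and worst-served group exceeds a tolerance.
from dataclasses import dataclass
import numpy as np

@dataclass
class AuditResult:
    window: str
    accuracy_by_group: dict
    max_gap: float
    drifted: bool

def audit_window(window_id, y_true, y_pred, groups, tolerance=0.05):
    acc = {int(g): float((y_pred[groups == g] == y_true[groups == g]).mean())
           for g in np.unique(groups)}
    gap = max(acc.values()) - min(acc.values())
    return AuditResult(window_id, acc, gap, gap > tolerance)

# toy monitoring loop over two windows of logged predictions
rng = np.random.default_rng(1)
for wid in ("2024-W01", "2024-W02"):
    groups = rng.integers(0, 2, 500)
    y_true = rng.integers(0, 2, 500)
    noise = rng.random(500) < (0.1 + 0.1 * groups)  # group 1 degrades faster
    y_pred = np.where(noise, 1 - y_true, y_true)
    print(audit_window(wid, y_true, y_pred, groups))
```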