1. Discrimination & Toxicity | 2 - Post-deployment

Unintentional bias amplification

Dataset bias may be unintentionally amplified [60]: the outputs of an AI model trained on a dataset can be more biased than the dataset itself.
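As a minimal sketch of how amplification can arise (a toy example, not drawn from the source): a model that simply predicts each group's majority label turns a 60/40 skew in the training data into a 100/0 skew in its outputs. All names and the synthetic data below are illustrative assumptions.

```python
from collections import Counter

# Synthetic data: group "A" is positive 60% of the time, group "B" 40%.
data = [("A", 1)] * 60 + [("A", 0)] * 40 + [("B", 1)] * 40 + [("B", 0)] * 60

def majority_label(rows, group):
    """The most common label for a group in the training data."""
    labels = [y for g, y in rows if g == group]
    return Counter(labels).most_common(1)[0][0]

# A degenerate "model": always predict the group's majority label.
model = {g: majority_label(data, g) for g in ("A", "B")}

def positive_rate(labels):
    """Fraction of labels equal to 1."""
    return sum(1 for y in labels if y == 1) / len(labels)

data_rate_A = positive_rate([y for g, y in data if g == "A"])        # 0.6 in the data
pred_rate_A = positive_rate([model[g] for g, _ in data if g == "A"])  # 1.0 in the outputs
print(data_rate_A, pred_rate_A)  # prints 0.6 1.0
```

The model's outputs are more skewed than the dataset it was trained on, which is the amplification effect described above.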

Source: MIT AI Risk Repository, risk ID mit1202

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit1202

Domain lineage

1. Discrimination & Toxicity

156 mapped risks

1.1 > Unfair discrimination and misrepresentation

Mitigation strategy

1. **Prioritized Data Audit and Rebalancing (Pre-training).** Rigorous exploratory data analysis (EDA), rebalancing techniques (e.g., oversampling or generating synthetic data for underrepresented groups), and noise reduction to ensure the training dataset is statistically and contextually representative. This addresses the foundational source of the systemic disparities the model is poised to amplify.

2. **Algorithmic Mitigation during Training (In-processing).** Specific algorithmic strategies, such as regularization terms that penalize dependence between sensitive features and model predictions (Prejudice Remover) or constraint-based optimization (Exponentiated Gradient Reduction) that satisfies predefined fairness metrics like Equalized Odds, actively counteracting the model's tendency toward bias amplification during the learning phase.

3. **Longitudinal Bias Auditing and Human-in-the-Loop (Post-deployment).** Continuous, longitudinal monitoring to track the deployed model's performance across sensitive attributes using quantitative fairness metrics, coupled with a Human-in-the-Loop (HITL) governance mechanism that provides contextual, ethical review of high-stakes or flagged decisions, ensuring swift intervention when evidence of post-deployment bias amplification or drift is detected.
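The post-deployment monitoring step above relies on quantitative fairness metrics tracked over time. A minimal sketch of one such metric, an equalized-odds gap computed from logged predictions, is below; the function names and the toy data are illustrative assumptions, not the repository's own tooling.

```python
def rate(flags):
    """Fraction of True values; 0.0 for an empty group."""
    return sum(flags) / len(flags) if flags else 0.0

def equalized_odds_gap(y_true, y_pred, group):
    """Largest between-group difference in TPR or FPR (0.0 = parity).

    Equalized Odds asks that true-positive and false-positive rates
    match across groups defined by the sensitive attribute.
    """
    gaps = []
    for label in (1, 0):  # label 1 compares TPRs, label 0 compares FPRs
        rates = []
        for g in set(group):
            flags = [p == 1 for t, p, s in zip(y_true, y_pred, group)
                     if s == g and t == label]
            rates.append(rate(flags))
        gaps.append(max(rates) - min(rates))
    return max(gaps)

# Toy batch of logged decisions: positives from group "a" are always
# accepted, positives from group "b" never are, so the gap is large.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
group  = ["a", "a", "b", "b", "a", "a", "b", "b"]
print(equalized_odds_gap(y_true, y_pred, group))  # prints 1.0
```

In a longitudinal setup, such a gap would be recomputed on each batch of logged decisions and an alert raised (routing cases to HITL review) when it drifts past a chosen threshold.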