1. Discrimination & Toxicity

Discriminative data bias

Discriminative data bias describes systematic discrimination against groups of people arising from shortcomings in the data, such as skewed distributional representation or incorrect records. If not appropriately treated, such bias can manifest in the model and lead to unfair decisions. Note that the term bias is also used in other contexts, such as data representation; those issues are covered by other AI hazards in this list.

Source: MIT AI Risk Repository (mit1001)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit1001

Domain lineage

1. Discrimination & Toxicity

156 mapped risks

1.1 > Unfair discrimination and misrepresentation

Mitigation strategy

1. Prioritize data pre-processing for fair representation. Implement rigorous data pre-processing techniques, such as reweighing or sampling (e.g., oversampling the minority group or undersampling the majority group), to adjust the training data distribution. Concurrently, conduct comprehensive data audits to identify and rectify shortcomings, including incomplete, obsolete, or disproportionately represented data, ensuring all sensitive groups are adequately and accurately represented.

2. Integrate in-processing fairness constraints during model training. Employ in-processing methods that incorporate explicit fairness constraints or regularization terms into the model's loss function. Techniques such as adding a fairness penalty term or using adversarial debiasing minimize the statistical dependence between sensitive features and model predictions, so that the learning algorithm actively mitigates discrimination.

3. Institute continuous algorithmic audits and governance. Establish a robust AI governance framework that mandates regular algorithmic audits and performance testing across all specified demographic groups (bias detection). This framework must ensure that fairness metrics are systematically assessed post-deployment and that corrective post-processing adjustments are applied to model outputs when necessary to maintain equitable outcomes.
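The reweighing technique mentioned in step 1 can be sketched as follows. This is a minimal illustration, not the repository's reference implementation; the function name and the toy data are hypothetical, and it assumes a single sensitive attribute and a discrete label:

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Per-sample weights that make the sensitive group statistically
    independent of the label in the weighted data: weight(g, y) =
    P(g) * P(y) / P(g, y). Under-represented (group, label) pairs get
    weights above 1, over-represented pairs get weights below 1."""
    n = len(groups)
    group_counts = Counter(groups)              # counts per sensitive group
    label_counts = Counter(labels)              # counts per outcome label
    joint_counts = Counter(zip(groups, labels)) # counts per (group, label)
    return [
        (group_counts[g] / n) * (label_counts[y] / n) / (joint_counts[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Toy data: group "B" rarely has the positive label, so (B, 1) samples
# are up-weighted (1.5) and the over-represented pairs are down-weighted.
groups = ["A", "A", "A", "B", "B", "B"]
labels = [1, 1, 0, 1, 0, 0]
weights = reweighing_weights(groups, labels)
# weights == [0.75, 0.75, 1.5, 1.5, 0.75, 0.75]
```

The weights would then be passed to a learner that accepts per-sample weights (e.g., a `sample_weight` argument), which is how libraries such as AIF360 apply the same idea.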
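For the post-deployment audits in step 3, one common fairness metric is the demographic parity ratio: the lowest positive-prediction rate across groups divided by the highest. The sketch below is an assumed illustration (function name and threshold are not from the repository); the 0.8 cutoff reflects the informal "four-fifths rule" heuristic, not a legal standard:

```python
def demographic_parity_ratio(groups, predictions):
    """Ratio of the minimum to the maximum positive-prediction rate
    across sensitive groups. 1.0 means all groups receive positive
    predictions at the same rate; lower values indicate disparity."""
    rates = {}
    for g in set(groups):
        group_preds = [p for grp, p in zip(groups, predictions) if grp == g]
        rates[g] = sum(group_preds) / len(group_preds)
    return min(rates.values()) / max(rates.values())

# Toy audit: group A gets positives 75% of the time, group B only 25%.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
preds = [1, 1, 1, 0, 1, 0, 0, 0]
ratio = demographic_parity_ratio(groups, preds)  # 0.25 / 0.75 = 1/3
flagged = ratio < 0.8  # True: below the four-fifths heuristic, review
```

In a continuous audit, such a metric would be computed on fresh post-deployment predictions at a regular cadence and compared against an agreed threshold before any corrective post-processing is applied.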