
Disparate Performance

In the context of evaluating the impact of generative AI systems, disparate performance refers to an AI system performing differently for different subpopulations, leading to unequal outcomes for those groups.

Source: MIT AI Risk Repository (mit169)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

3 - Other

Risk ID

mit169

Domain lineage

1. Discrimination & Toxicity > 1.3 Unequal performance across groups (156 mapped risks)

Mitigation strategy

1. Prioritize and mandate bias mitigation in governance: Establish a formal governance framework, beginning at the model's conception phase, that integrates Diversity, Equity, and Inclusion (DEI) principles and ensures leadership commitment to funding and mandating bias-mitigation initiatives throughout the entire AI system lifecycle.

2. Ensure training data representativeness: Systematically collect, analyze, and curate training datasets so they are diverse and representative of all target subpopulations. This includes employing techniques such as reweighting, resampling (e.g., oversampling the minority class), or demography-aware synthetic data generation to counteract demographic imbalance and data sparsity (see the sketch after this list).

3. Implement fairness-aware algorithmic constraints: Integrate fairness constraints during model training, using techniques such as adversarial debiasing, fair representation learning, or fair regularization, to minimize biased outcomes and promote equitable system performance across identified sensitive subpopulations.
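To make the data-balancing techniques in item 2 concrete, here is a minimal Python sketch of subgroup reweighting and oversampling. It assumes a scikit-learn-style estimator; the toy data, subgroup labels, and variable names are illustrative assumptions, not part of the repository entry.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

# Illustrative toy data: features X, labels y, and a per-example
# subgroup attribute (e.g., dialect or region). Group "b" is
# deliberately underrepresented (10% of examples).
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)
group = rng.choice(["a", "b"], size=n, p=[0.9, 0.1])

# Reweighting: weight each example inversely to its subgroup's
# frequency so the minority subgroup contributes equally to the loss.
labels, counts = np.unique(group, return_counts=True)
freq = dict(zip(labels, counts / n))
weights = np.array([1.0 / freq[g] for g in group])
weights /= weights.mean()  # keep the effective sample size unchanged

clf = LogisticRegression().fit(X, y, sample_weight=weights)

# Resampling alternative: oversample the minority subgroup to parity
# instead of reweighting.
minority = group == "b"
X_up, y_up = resample(
    X[minority], y[minority],
    replace=True, n_samples=int((~minority).sum()), random_state=0,
)
X_balanced = np.vstack([X[~minority], X_up])
y_balanced = np.concatenate([y[~minority], y_up])
```

Normalizing the weights leaves the total loss scale unchanged, so reweighting alters only how the loss is distributed across subgroups; oversampling achieves a similar effect but duplicates minority examples, which can increase overfitting to them.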

ADDITIONAL EVIDENCE

A model trained on a dataset disproportionately skewed towards one demographic group may perform poorly for other demographic groups [43]. Data availability differs due to geographic biases in data collection [216], disparate digitization of content globally owing to varying levels of internet access, and infrastructure built to support some languages or accents over others, among other reasons. Much of the training data for state-of-the-art generative models comes from the internet, yet the composition of this data reflects historical usage patterns: 5% of the world speaks English at home, while 63.7% of internet communication is in English [197]. This has implications for downstream performance, as models underperform on the parts of the distribution underrepresented in the training set.

For example, automatic speech recognition (ASR) models, which convert spoken language (audio) to text, have been shown to exhibit racial disparities [130], forcing people to adapt their speech to engage with such systems [100], with implications (see 4.2.3.2 Imposing Norms and Values) for accent representation in popular audio generation.

Interventions to mitigate harms caused by generative AI systems may themselves introduce and exhibit disparate performance issues [238]. For instance, automated hate speech detection driven by annotated data that is insensitive to dialect differences can amplify harm to minority or marginalized groups by silencing their voices (see 4.2.2.1 Community Erasure) or incorrectly labeling their speech as offensive [67]. Interventions should therefore document which particular populations and norms they seek to cover, and which they do not.
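Detecting the disparities this evidence describes begins with disaggregated evaluation, i.e., scoring the model separately for each subgroup rather than in aggregate. A minimal sketch, with hypothetical labels and group names chosen purely for illustration:

```python
import numpy as np

def disaggregated_accuracy(y_true, y_pred, group):
    """Return per-subgroup accuracy and the best-to-worst gap.

    A large gap between the best- and worst-served subgroups is the
    disparate-performance signal described in this section.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    scores = {}
    for g in np.unique(group):
        mask = group == g
        scores[str(g)] = float((y_true[mask] == y_pred[mask]).mean())
    gap = max(scores.values()) - min(scores.values())
    return scores, gap

# Hypothetical usage, disaggregating by dialect:
scores, gap = disaggregated_accuracy(
    y_true=[1, 0, 1, 1, 0, 1],
    y_pred=[1, 0, 0, 1, 0, 0],
    group=["AAE", "SAE", "AAE", "SAE", "SAE", "AAE"],
)
print(scores, gap)  # {'AAE': 0.333..., 'SAE': 1.0} with a gap of 0.666...
```

The same disaggregation applies to any metric (word error rate for ASR, false-positive rate for toxicity classifiers); the key practice is reporting per-group results alongside the aggregate, so underperformance on underrepresented groups is not averaged away.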