1. Discrimination & Toxicity

Bias

In the context of AI, bias refers to the tendency of AI-generated responses or recommendations to unfairly favor or disfavor a person or group (Ntoutsi et al., 2020). Biases of various forms appear in the content generated by language models, often traceable to the training data. For example, exclusionary norms arise when the training data represents only a fraction of the population (Zhuo et al., 2023). Similarly, monolingual bias in multilingual settings arises when the training data is in a single language (Weidinger et al., 2021). Because ChatGPT operates worldwide, sensitivity to regional cultural differences is crucial for avoiding bias (Dwivedi et al., 2023). When AI assists decision-making across different stages of employment, both bias and opacity can arise (Chan, 2022). Stereotypes about particular genders, sexual orientations, races, or occupations are common in recommendations produced by generative AI.

Hence, the representativeness, completeness, and diversity of the training data are essential to ensure fairness and avoid bias (Gonzalez, 2023). Training on synthetic data can increase dataset diversity and address sample-selection biases caused by class imbalance (Chen et al., 2021). Generative AI applications should be tested and evaluated by a diverse group of users and subject-matter experts. Additionally, increasing the transparency and explainability of generative AI helps in identifying and detecting biases so that appropriate corrective measures can be taken.
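The class-imbalance point above can be made concrete with a minimal sketch. Random oversampling is used here as a simple stand-in for the more sophisticated synthetic-data generation the text describes; the function name, `group` key, and toy dataset are illustrative assumptions, not part of the repository entry.

```python
import random

def oversample_minority(records, group_key):
    """Randomly duplicate under-represented groups until every group
    matches the size of the largest one (a crude proxy for generating
    synthetic examples to correct sample-selection imbalance)."""
    groups = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Top up smaller groups by sampling with replacement.
        balanced.extend(random.choices(members, k=target - len(members)))
    return balanced

# Hypothetical toy dataset with a 3:1 group imbalance.
data = [{"group": "A"}] * 30 + [{"group": "B"}] * 10
balanced = oversample_minority(data, "group")
```

In practice, naive duplication can cause overfitting to repeated minority examples, which is one motivation for generating genuinely synthetic records instead.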

Source: MIT AI Risk Repository (mit535)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

3 - Other

Risk ID

mit535

Domain lineage

1. Discrimination & Toxicity

156 mapped risks

1.1 > Unfair discrimination and misrepresentation

Mitigation strategy

1. Prioritize Diverse and Representative Data Curation: Establish a data governance policy requiring that all training and fine-tuning datasets be assessed for representativeness and completeness across salient demographic and cultural dimensions. Apply preprocessing methods such as reweighting, resampling, or synthetic-data generation to mitigate sample-selection bias, correct class imbalance, and counter exclusionary norms before model ingestion.

2. Implement Fairness-Aware Algorithmic Interventions: Integrate formal bias mitigation techniques directly into the model development lifecycle. Apply fairness constraints, adversarial debiasing, or fair representation learning to decouple model predictions from sensitive attributes (e.g., race, gender, occupation) during training, supporting equal-opportunity and outcome-parity goals.

3. Establish Robust Governance and Continuous Oversight: Institute systemic controls, including mandatory human-in-the-loop review of all high-impact AI-aided decisions, and adopt explainable AI (XAI) practices to make decision rationale transparent. Commission continuous post-deployment audits and monitoring, using defined fairness metrics, to detect and correct bias drift or newly emerging discriminatory patterns over the system's operational lifespan.
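The "defined fairness metrics" in item 3 can be sketched with one common example: the demographic parity gap, i.e., the largest difference in positive-decision rates between groups in an audit log. This is a minimal illustration of one possible audit metric; the function name, log format, and toy numbers are assumptions for the example, not prescribed by the repository entry.

```python
def demographic_parity_gap(outcomes):
    """outcomes: iterable of (group, decision) pairs, decision in {0, 1}.
    Returns the largest difference in positive-decision rates between
    any two groups (0.0 means all groups receive positives equally)."""
    totals, positives = {}, {}
    for group, decision in outcomes:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + decision
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values())

# Hypothetical audit log: group A is approved 80% of the time, group B 40%.
log = [("A", 1)] * 8 + [("A", 0)] * 2 + [("B", 1)] * 4 + [("B", 0)] * 6
gap = demographic_parity_gap(log)  # 0.8 - 0.4 = 0.4
```

A monitoring process could recompute such a gap on each audit window and flag the system for review when it exceeds an agreed threshold; which metric and threshold are appropriate depends on the deployment context.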