1. Discrimination & Toxicity

Algorithm and data

More than 20% of the contributions center on the ethical dimensions of algorithms and data. This theme can be further divided into two main subthemes: (1) data bias and algorithmic fairness, and (2) algorithmic opacity.

Source: MIT AI Risk Repository (mit578)

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

1 - Pre-deployment

Risk ID

mit578

Domain lineage

1. Discrimination & Toxicity

156 mapped risks

1.1 > Unfair discrimination and misrepresentation

Mitigation strategy

Priority 1: Data-centric pre-processing and curation
Implement rigorous data governance to ensure the training data is diverse and statistically representative of all intended demographic and protected subgroups. This involves collecting data from multiple sources, employing techniques such as reweighting or oversampling to balance data distributions, and using blinded annotation protocols to mitigate human bias during labeling.

Priority 2: Fairness-aware algorithmic integration
Integrate explicit fairness constraints directly into the model's objective function during training (in-processing). Fairness-aware machine learning techniques such as adversarial debiasing or fair representation learning reduce the dependence of the model's outcomes on sensitive attributes, lowering the risk that the algorithm learns or perpetuates discriminatory patterns.

Priority 3: Continuous auditing and governance
Establish a continuous monitoring and auditing framework to detect bias drift and sustain fairness post-deployment. This includes regular, automated fairness checks using statistical metrics (e.g., disparate impact, equalized odds) across all user subgroups, and transparent processes, such as explainable AI (XAI) and Human-in-the-Loop (HITL) review, for critical or potentially high-risk decisions.
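Two of the techniques named above, reweighting (Priority 1) and the disparate-impact audit metric (Priority 3), can be sketched in a few lines of stdlib Python. This is an illustrative sketch, not the repository's reference implementation: the function names, the two-group setup, and the toy data are assumptions for the example. The reweighting follows the common Kamiran–Calders scheme of weighting each (group, label) cell by its expected count under independence divided by its observed count.

```python
from collections import Counter

def disparate_impact(outcomes, groups, positive=1, privileged="A"):
    """Ratio of favorable-outcome rates: unprivileged / privileged.
    Values below ~0.8 are a common red flag (the 'four-fifths rule').
    Assumes exactly two groups for simplicity."""
    def rate(g):
        pairs = [o for o, gr in zip(outcomes, groups) if gr == g]
        return sum(1 for o in pairs if o == positive) / len(pairs)
    unprivileged = next(g for g in set(groups) if g != privileged)
    return rate(unprivileged) / rate(privileged)

def reweight(labels, groups):
    """Per-sample weights that balance (group, label) cells:
    expected count under group/label independence / observed count.
    Under-represented cells get weight > 1, over-represented < 1."""
    n = len(labels)
    g_cnt = Counter(groups)              # marginal group counts
    y_cnt = Counter(labels)              # marginal label counts
    gy_cnt = Counter(zip(groups, labels))  # joint cell counts
    return [(g_cnt[g] * y_cnt[y]) / (n * gy_cnt[(g, y)])
            for g, y in zip(groups, labels)]
```

In a training pipeline, the weights returned by `reweight` would be passed as per-sample weights to the learner, and `disparate_impact` would be recomputed on fresh predictions during the periodic audits described under Priority 3.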