Risks from models and algorithms (Risks of bias and discrimination)
During algorithm design and training, personal biases may be introduced, whether intentionally or unintentionally. In addition, poor-quality datasets can lead to biased or discriminatory outputs, including discriminatory content concerning ethnicity, religion, nationality, and region.
ENTITY
1 - Human
INTENT
3 - Other
TIMING
1 - Pre-deployment
Risk ID
mit682
Domain lineage
1. Discrimination & Toxicity
1.1 > Unfair discrimination and misrepresentation
Mitigation strategy
1. Mandate Diverse Data Governance and Problem Framing. Implement rigorous data quality and curation protocols to ensure training datasets are diverse, representative of the target population, and free of historical or sampling biases. This requires a multidisciplinary problem-framing stage, involving social scientists and domain experts, to explicitly define fair outcomes and to avoid proxy variables that may inadvertently encode systemic inequities.
2. Employ Bias-Aware Algorithmic Techniques. Apply in-processing or post-processing adjustments, such as fairness-aware optimization objectives (e.g., MinDiff or Counterfactual Logit Pairing) or explicit fairness constraints, systematically during model training to reduce the influence of sensitive attributes and promote equitable performance across demographic subgroups (see the fairness-regularizer sketch below).
3. Establish Continuous Auditing and Human Oversight. Institute a robust AI governance framework that mandates regular, independent bias audits throughout the AI lifecycle, from data collection through post-deployment monitoring (see the bias-audit sketch below). For high-stakes applications, require human-in-the-loop oversight to review and validate algorithmic decisions, acting as a safeguard against the perpetuation of discriminatory outputs.
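To make item 2 concrete, the following is a minimal illustrative sketch of the general idea behind in-processing fairness techniques: adding a fairness penalty to the training objective. It is not the MinDiff or Counterfactual Logit Pairing implementation itself; it trains a plain logistic regression with a demographic-parity-style penalty (the squared gap in mean predicted score between two groups). The data, the variable names (X, y, g), and the fairness_weight parameter are all hypothetical assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: features X, labels y, binary sensitive attribute g.
n, d = 2000, 5
X = rng.normal(size=(n, d))
g = rng.integers(0, 2, size=n)                  # sensitive group membership
true_logits = X @ rng.normal(size=d) + 0.8 * g  # group-correlated signal (bias source)
y = (true_logits + rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, g, fairness_weight=0.0, lr=0.1, epochs=300):
    """Logistic regression with a demographic-parity-style penalty:
    fairness_weight * (mean score of group 1 - mean score of group 0)^2."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        # Gradient of the average cross-entropy term.
        err = p - y
        grad_w = X.T @ err / len(y)
        grad_b = err.mean()
        # Gradient of the fairness penalty via the chain rule.
        gap = p[g == 1].mean() - p[g == 0].mean()
        s = p * (1 - p)  # derivative of the sigmoid
        d_gap_w = (X[g == 1].T @ s[g == 1]) / (g == 1).sum() \
                - (X[g == 0].T @ s[g == 0]) / (g == 0).sum()
        d_gap_b = s[g == 1].mean() - s[g == 0].mean()
        grad_w += fairness_weight * 2 * gap * d_gap_w
        grad_b += fairness_weight * 2 * gap * d_gap_b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Compare the between-group score gap with and without the penalty.
for fw in (0.0, 5.0):
    w, b = train(X, y, g, fairness_weight=fw)
    p = sigmoid(X @ w + b)
    print(f"fairness_weight={fw}: score gap between groups = "
          f"{abs(p[g == 1].mean() - p[g == 0].mean()):.3f}")
```

The design trade-off is explicit here: raising fairness_weight shrinks the between-group gap at some cost in raw predictive accuracy, which is why such weights are tuned per application rather than fixed globally.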
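For item 3, a bias audit typically compares simple metrics across demographic subgroups. The sketch below is a hedged illustration, assuming binary labels, binary predictions, and a binary sensitive attribute (all names hypothetical); it computes per-group selection rates and true positive rates plus a disparate-impact ratio, where a ratio below 0.8 would fail the commonly cited four-fifths rule.

```python
import numpy as np

def bias_audit(y_true, y_pred, group):
    """Per-group audit metrics: selection rate (for disparate impact)
    and true positive rate (for equal-opportunity comparisons)."""
    report = {}
    for g in np.unique(group):
        mask = group == g
        report[g] = {
            "selection_rate": y_pred[mask].mean(),
            "tpr": y_pred[mask & (y_true == 1)].mean(),
        }
    rates = [m["selection_rate"] for m in report.values()]
    report["disparate_impact_ratio"] = min(rates) / max(rates)
    return report

# Example: audit hypothetical binary predictions across two groups.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 500)
y_pred = rng.integers(0, 2, 500)
group = rng.integers(0, 2, 500)
print(bias_audit(y_true, y_pred, group))
```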