Stereotyping social groups
Stereotyping in an algorithmic system refers to how the system’s outputs reflect “beliefs about the characteristics, attributes, and behaviors of members of certain groups... and about how and why certain attributes go together.”
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit134
Domain lineage
1. Discrimination & Toxicity
1.1 > Unfair discrimination and misrepresentation
Mitigation strategy
1. Implement continuous post-deployment auditing and monitoring of the system's outputs, using a suite of fairness metrics (e.g., demographic parity, equalized odds) to detect and quantify stereotype-associated errors and bias drift across diverse and intersectional social groups (a minimal sketch follows this list). Incorporate feedback mechanisms so that real-world evidence and harms reported by affected users feed back into the audit.
2. Apply output-correction and inference-stage mitigations, such as post-processing adjustments (e.g., the Randomized Threshold Optimizer or Calibrated Equalized Odds) or simple prompting techniques, to re-weight or modify predictions and generations found to reinforce or violate stereotypes, reducing differential treatment and misrepresentation in the final system output.
3. Establish a mandatory human-in-the-loop governance structure in which human reviewers validate and can override high-stakes AI-assisted decisions, together with organizational transparency that documents the system's intended use, known bias limitations, and the results of all applied mitigations.
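As referenced in item 1, the sketch below illustrates what per-group fairness auditing and a simple post-processing adjustment could look like. All names (`audit_fairness`, `parity_gap`, `group_thresholds`) and the example scores, labels, and groups are hypothetical illustrations, not tooling described in this entry; the group-specific thresholding is a plain stand-in for methods such as Calibrated Equalized Odds rather than an implementation of them.

```python
"""Minimal sketch: post-deployment fairness auditing and a simple
post-processing threshold adjustment. Assumes binary decisions derived
from model scores in [0, 1], with known group membership per record."""
from collections import defaultdict
from typing import Dict, List, Tuple


def audit_fairness(
    scores: List[float],
    labels: List[int],
    groups: List[str],
    threshold: float = 0.5,
) -> Dict[str, Dict[str, float]]:
    """Per-group selection rate (demographic parity) and true/false
    positive rates (equalized odds)."""
    by_group: Dict[str, List[Tuple[float, int]]] = defaultdict(list)
    for s, y, g in zip(scores, labels, groups):
        by_group[g].append((s, y))

    stats: Dict[str, Dict[str, float]] = {}
    for g, rows in by_group.items():
        preds = [1 if s >= threshold else 0 for s, _ in rows]
        ys = [y for _, y in rows]
        pos = sum(ys) or 1                      # guard against empty classes
        neg = (len(ys) - sum(ys)) or 1
        stats[g] = {
            "selection_rate": sum(preds) / len(preds),
            "tpr": sum(p for p, y in zip(preds, ys) if y) / pos,
            "fpr": sum(p for p, y in zip(preds, ys) if not y) / neg,
        }
    return stats


def parity_gap(stats: Dict[str, Dict[str, float]], metric: str) -> float:
    """Largest pairwise difference in a metric across groups; a drift
    monitor could alert when this exceeds a tolerance."""
    vals = [s[metric] for s in stats.values()]
    return max(vals) - min(vals)


def group_thresholds(
    scores: List[float],
    groups: List[str],
    per_group_threshold: Dict[str, float],
    default: float = 0.5,
) -> List[int]:
    """Post-processing sketch: re-decide outputs with group-specific
    thresholds chosen offline to shrink the audited gaps."""
    return [
        1 if s >= per_group_threshold.get(g, default) else 0
        for s, g in zip(scores, groups)
    ]


if __name__ == "__main__":
    scores = [0.9, 0.4, 0.7, 0.3, 0.45, 0.55]   # hypothetical batch
    labels = [1, 0, 1, 0, 1, 1]
    groups = ["a", "a", "a", "b", "b", "b"]

    stats = audit_fairness(scores, labels, groups)
    print(stats)
    print("selection-rate gap:", parity_gap(stats, "selection_rate"))

    # If the audit shows group "b" is under-selected at the shared
    # threshold, a lower threshold for "b" narrows the gap.
    print(group_thresholds(scores, groups, {"a": 0.5, "b": 0.45}))
```

In practice, the per-group thresholds would be tuned offline against a validation set, and the gap metrics tracked over time so that drift past a set tolerance triggers the human review described in item 3.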
ADDITIONAL EVIDENCE
Exclusionary norms [in language models] can manifest in 'subtle patterns' like referring to 'women doctors' as if 'doctor' itself entails not-woman.