Representational Harms
Beliefs about different social groups that reproduce unjust societal hierarchies.
ENTITY
3 - Other
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit133
Domain lineage
1. Discrimination & Toxicity
1.1 > Unfair discrimination and misrepresentation
Mitigation strategy
1. Establish a rigorous **measurement and mitigation practice** that goes beyond purely behavioral observation to quantify the cognitive, psychological, and social effects of algorithmic outputs, ensuring that mitigation addresses *how* social groups are represented (e.g., stereotyped or demeaned) rather than solely *who* is represented. Prioritize pre-processing techniques such as collecting diverse, inclusive datasets and transforming features to reduce their correlation with sensitive attributes.
2. Implement **in-processing bias-aware algorithms** and fairness constraints during model training to actively counteract the perpetuation of unjust societal hierarchies. This includes fair representation learning, which develops data representations invariant to sensitive attributes and thereby prevents the model from systematically encoding discriminatory patterns.
3. Mandate **continuous algorithmic audits and impact assessments** in the post-deployment phase to track emerging representational harms and stereotypes, especially those arising from prompts that do not explicitly mention gender or group membership. Use explainable AI (XAI) and post-processing techniques to identify, diagnose, and swiftly rectify biased outputs, minimizing cumulative reputational and societal harm.
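The pre-processing step in strategy 1 — transforming features to reduce their correlation with sensitive attributes — can be sketched as follows. This is a minimal illustration, not a prescribed implementation: it assumes a single numeric feature and a binary sensitive attribute, and the `decorrelate` helper is hypothetical. It removes the linear component of the sensitive attribute from the feature by regressing the feature on the attribute and keeping only the residuals.

```python
import numpy as np


def decorrelate(feature: np.ndarray, sensitive: np.ndarray) -> np.ndarray:
    """Remove the linear component of `sensitive` from `feature`.

    Regresses the feature on the sensitive attribute (plus an
    intercept) via least squares and returns the residuals, which
    are linearly uncorrelated with the sensitive attribute.
    """
    X = np.column_stack([np.ones(len(sensitive)), sensitive.astype(float)])
    beta, *_ = np.linalg.lstsq(X, feature.astype(float), rcond=None)
    return feature - X @ beta


# Synthetic example: a feature strongly correlated with a
# binary sensitive attribute (e.g., group membership).
rng = np.random.default_rng(0)
sensitive = rng.integers(0, 2, size=1000)
feature = 2.0 * sensitive + rng.normal(size=1000)

cleaned = decorrelate(feature, sensitive)
print(abs(np.corrcoef(feature, sensitive)[0, 1]))   # strong correlation before
print(abs(np.corrcoef(cleaned, sensitive)[0, 1]))   # near zero after residualization
```

Residualization only removes *linear* dependence; nonlinear associations with the sensitive attribute can survive it, which is one reason the in-processing and post-deployment audit strategies above remain necessary.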