Denying people the opportunity to self-identify
AI systems represent and classify humans in complex and non-traditional ways, often at the cost of lost autonomy. For example, categorizing someone who identifies as non-binary into a gendered category they do not belong to undermines people's ability to disclose aspects of their identity on their own terms.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit138
Domain lineage
1. Discrimination & Toxicity
1.1 > Unfair discrimination and misrepresentation
Mitigation strategy
1. Implement rigorous training data curation and filtering protocols to remove or redact automatically inferred identity labels (e.g., gender, religion, character traits) where such labels are not self-reported or necessary for the system's function.
2. Integrate explicit and accessible user feedback and self-correction mechanisms into the system interface so individuals can review, override, or opt out of any automatically applied identity classifications, thereby restoring autonomy.
3. Employ proactive mitigation strategies, such as constrained decoding or sophisticated prompt engineering (e.g., Proactive Guidance), to restrict the model's output from generating narrow, binary, or stereotypical identity categories, particularly for attributes that are not visually determinable.
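A minimal sketch of the first mitigation (redacting inferred identity labels from training records) might look like the following. The record schema and field names (`labels`, `self_reported`) are hypothetical assumptions for illustration, not a prescribed format:

```python
# Sketch of mitigation 1: redact automatically inferred identity labels
# from a training record unless the subject self-reported them.
# Schema ("labels", "self_reported") is a hypothetical assumption.

IDENTITY_ATTRIBUTES = {"gender", "religion", "character_traits"}

def redact_inferred_labels(record: dict) -> dict:
    """Return a copy of `record` with non-self-reported identity labels removed."""
    cleaned = dict(record)
    labels = dict(cleaned.get("labels", {}))
    self_reported = set(cleaned.get("self_reported", []))
    for attr in IDENTITY_ATTRIBUTES:
        if attr in labels and attr not in self_reported:
            del labels[attr]  # inferred, not disclosed: redact
    cleaned["labels"] = labels
    return cleaned

record = {
    "text": "...",
    "labels": {"gender": "female", "topic": "sports"},
    "self_reported": [],  # the subject disclosed nothing
}
print(redact_inferred_labels(record)["labels"])  # {'topic': 'sports'}
```

Non-identity labels (here, `topic`) pass through untouched; only attributes on the sensitive list that lack a corresponding self-report are dropped.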
ADDITIONAL EVIDENCE
It's definitely frustrating having [classifiers] get integral parts of my identity wrong. And I find it frustrating that these sorts of apps only tend to recognize two binary genders