1. Discrimination & Toxicity

Disparate Performance

The LLM's performance can differ significantly across groups of users. For example, question-answering capability has shown significant performance differences across racial and social-status groups, and fact-checking ability can differ across tasks and languages.

Source: MIT AI Risk Repository, mit491

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

3 - Other

Risk ID

mit491

Domain lineage

1. Discrimination & Toxicity

156 mapped risks

1.3 > Unequal performance across groups

Mitigation strategy

1. **Rigorous Data Curation and Augmentation Protocols** Implement systematic data auditing to identify and correct imbalances, underrepresentation, or inherent biases within the training corpus. Proactive strategies, such as dataset augmentation, relabelling (e.g., massaging), or feature perturbation (e.g., Disparate Impact Remover), should be employed to ensure the training data is representative across all relevant user demographics and cultural contexts, thereby mitigating the core source of disparate performance.

2. **Integration of Fairness-Aware Optimization Functions** Apply in-processing bias mitigation techniques by modifying the model's loss function during the training phase. This involves incorporating fairness-aware algorithms such as MinDiff or Counterfactual Logit Pairing (CLP) to introduce penalties for prediction discrepancies across sensitive attributes or predefined data slices, explicitly optimizing the model toward a more equitable performance profile (e.g., achieving equalized odds or demographic parity).

3. **Continuous Performance Monitoring and Algorithmic Audits** Establish a robust governance framework that mandates continuous, multi-metric performance monitoring and auditing. This includes regularly evaluating the LLM's outputs against specialized benchmarks using diverse fairness metrics (e.g., those found in tools like AI Fairness 360) to detect post-deployment regressions or the emergence of new biases. Regular testing, including adversarial and prompt injection attempts, should focus on assessing the model's consistency and robustness across varying user groups.
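The monitoring step above starts with making disparate performance measurable. The following minimal Python sketch computes per-group accuracy and a worst-case disparity gap over an evaluation set; the group names, data, and the choice of accuracy as the metric are illustrative assumptions, not part of the repository entry.

```python
# Hypothetical per-group performance audit for an LLM evaluation set.
# Group labels and the accuracy metric are illustrative assumptions.
from collections import defaultdict


def per_group_accuracy(records):
    """records: iterable of (group, correct) pairs -> {group: accuracy}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += int(correct)
    return {g: hits[g] / totals[g] for g in totals}


def disparity_gap(acc_by_group):
    """Largest pairwise accuracy difference across groups (0.0 = parity)."""
    vals = list(acc_by_group.values())
    return max(vals) - min(vals)


# Example: question-answering results tagged with a hypothetical group attribute.
results = [
    ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False),
]
acc = per_group_accuracy(results)
gap = disparity_gap(acc)
```

In a real audit the gap would be tracked per benchmark and per release, with a threshold that triggers retraining or data-curation work when the disparity regresses.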

ADDITIONAL EVIDENCE

There are multiple causes of disparate performance, including inherent differences in task difficulty, the lack of particular dimensions of data, imbalance in the training data, and difficulty in understanding the cultural backgrounds of different societies.