Lower performance for some languages and social groups
LMs are typically trained on only a few languages and perform less well in others [95, 162]. In part, this is due to the unavailability of training data: there are many widely spoken languages for which no systematic efforts have been made to create labelled training datasets, such as Javanese, which is spoken by more than 80 million people [95]. Training data is particularly scarce for languages spoken by groups who are multilingual and can use a technology in English, or by groups who are not the primary target demographic for new technologies.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit209
Domain lineage
1. Discrimination & Toxicity
1.3 > Unequal performance across groups
Mitigation strategy
1. Community-Centric Data Curation and Corpus Balancing: Initiate collaborative, community-driven data collection programs to acquire authentic, high-quality corpora for low-resource languages, dialects, and sociolects. Subsequently, employ data pre-processing techniques such as Counterfactual Data Augmentation (CDA) to systematically balance the training corpus and mitigate the underrepresentation of marginalized linguistic variations, thereby providing a more equitable foundation for model learning (illustrative sketch below).
2. Targeted Fine-Tuning via Transfer Learning: Use established multilingual models as a base and apply targeted fine-tuning, e.g. with Parameter-Efficient Fine-Tuning (PEFT) methods, on the newly curated low-resource datasets. This strategy leverages the generalized linguistic knowledge acquired from high-resource data while specializing the model to the morphological, syntactic, and semantic characteristics of the underserved language or social group, thereby improving local performance and accuracy (illustrative sketch below).
3. Continuous Cross-Group Disparity Monitoring: Implement a rigorous observability and Human-in-the-Loop (HITL) system to continuously measure and track performance metrics, such as response quality, accuracy, and bias scores, across all defined social and linguistic subgroups. This monitoring confirms that mitigation actions remain effective, identifies emerging performance disparities (drift), and triggers model refinement or system prompt adjustments to maintain equitable and consistent service quality post-deployment (illustrative sketch below).
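As a rough illustration of strategy 1, the sketch below applies a CDA-style substitution over a small lexicon of dialect variants and augments a corpus until dialect-tagged examples reach a target share. The lexicon entries, the `text`/`group` record fields, and the default target share are assumptions chosen for illustration, not part of any cited method.

```python
# Minimal sketch of counterfactual data augmentation (CDA) for corpus balancing.
# The lexicon and record fields below are illustrative placeholders.
import random
import re

# Hypothetical mapping from standard forms to dialect/sociolect variants.
VARIANT_LEXICON = {
    "you all": "y'all",
    "going to": "finna",
}

def counterfactual_variants(sentence: str) -> list[str]:
    """Return counterfactual copies of a sentence with variant terms swapped in."""
    variants = []
    for standard, dialect in VARIANT_LEXICON.items():
        pattern = rf"\b{re.escape(standard)}\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            variants.append(re.sub(pattern, dialect, sentence, flags=re.IGNORECASE))
    return variants

def balance_corpus(corpus: list[dict], target_share: float = 0.5) -> list[dict]:
    """Augment a corpus until dialect-tagged examples reach the target share."""
    augmented = list(corpus)
    dialect_count = sum(1 for ex in augmented if ex["group"] == "dialect")
    candidates = [ex for ex in corpus if ex["group"] == "standard"]
    random.shuffle(candidates)
    for ex in candidates:
        if dialect_count / len(augmented) >= target_share:
            break
        for variant in counterfactual_variants(ex["text"]):
            augmented.append({"text": variant, "group": "dialect"})
            dialect_count += 1
    return augmented
```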
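For strategy 2, the sketch below shows LoRA-based fine-tuning (one PEFT method) using the Hugging Face `peft`, `transformers`, and `datasets` libraries. The model identifier, dataset name, target modules, and hyperparameters are placeholders and would need to be chosen for the specific multilingual base model and community-curated corpus.

```python
# Minimal sketch of targeted fine-tuning with LoRA on a curated
# low-resource-language corpus. Identifiers below are placeholders.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE_MODEL = "your-multilingual-base-model"   # placeholder model identifier
CORPUS = "your-community-curated-corpus"      # placeholder dataset name

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Wrap the base model with low-rank adapters so only a small fraction of
# parameters is updated during specialization; target modules depend on the
# base architecture.
lora = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()

dataset = load_dataset(CORPUS, split="train")
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-low-resource", num_train_epochs=3,
                           per_device_train_batch_size=8, learning_rate=2e-4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```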
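For strategy 3, the sketch below computes an average quality score per subgroup and flags groups that trail the best-performing group by more than a tolerance. The record fields and the tolerance are illustrative assumptions; a production HITL system would route these alerts into human review and model-refinement workflows.

```python
# Minimal sketch of cross-group disparity monitoring. Field names and the
# tolerance value are illustrative placeholders.
from collections import defaultdict

def group_metrics(records: list[dict]) -> dict[str, float]:
    """Average a per-response quality score for each social/linguistic subgroup."""
    totals, counts = defaultdict(float), defaultdict(int)
    for r in records:
        totals[r["group"]] += r["score"]
        counts[r["group"]] += 1
    return {g: totals[g] / counts[g] for g in totals}

def disparity_alerts(records: list[dict], tolerance: float = 0.05) -> list[str]:
    """Flag subgroups whose metric trails the best-performing group by more than the tolerance."""
    metrics = group_metrics(records)
    best = max(metrics.values())
    return [g for g, m in metrics.items() if best - m > tolerance]
```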
ADDITIONAL EVIDENCE
Training data can also be lacking when relatively little digitised text is available in a language, e.g. Seychellois Creole [95]. Disparate performance can also occur based on slang, dialect, sociolect, and other aspects that vary within a single language [23]. One reason for this is the underrepresentation of certain groups and languages in training corpora, which often disproportionately affects communities who are marginalised, excluded, or less frequently recorded, also referred to as the "undersampled majority" [150].