2. Privacy & Security

Privacy and Data Leakage

Large models pre-trained on internet text may memorize private information such as phone numbers, email addresses, and residential addresses.

Source: MIT AI Risk Repository (mit68)

- **ENTITY:** 2 - AI
- **INTENT:** 2 - Unintentional
- **TIMING:** 1 - Pre-deployment
- **Risk ID:** mit68
- **Domain lineage:** 2. Privacy & Security (186 mapped risks) > 2.1 Compromise of privacy by leaking or correctly inferring sensitive information

Mitigation strategy

1. **Implement Robust PII Sanitization and Redaction Protocols.** Apply Named Entity Recognition (NER) and contextual analysis tools to systematically identify and redact or replace Personally Identifiable Information (PII) and Potentially Sensitive Information (PSI) across the pre-training corpus. This pre-deployment defense ensures that only anonymized text is used to train the model, preventing the memorization and subsequent regurgitation of private data.
2. **Integrate Differential Privacy (DP) into the Training Process.** Apply formal privacy-preserving techniques, such as differentially private optimization (e.g. DP-SGD), during pre-training. Introducing controlled noise into the gradient updates statistically limits the influence of any single data point on the final model parameters, which strengthens the model against membership inference and data extraction attacks by bounding how reliably generated output can be linked back to specific records in the training set.
3. **Enforce Strict Data Governance through Source Selection and Automated Filtering.** Prioritize training data from authoritative, high-quality, non-sensitive sources, and apply multi-layered automated filters to detect and remove unreliable, low-quality, or overtly personal or sensitive content before ingestion. This upstream curation minimizes exposure to private information from indiscriminate web scraping and reduces the volume of sensitive data that requires later sanitization.
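The PII sanitization step above can be sketched minimally with pattern-based redaction. This is an illustrative toy, not the repository's recommended tooling: production pipelines would use NER models and contextual analysis (as the mitigation describes) rather than two hand-written regexes, and the patterns and placeholder labels here are assumptions for the example.

```python
import re

# Illustrative patterns only -- real PII detection needs NER plus context,
# and far broader coverage (addresses, IDs, names, etc.).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder token."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, `redact("Contact jane.doe@example.com or +1 555-123-4567.")` yields `"Contact [EMAIL] or [PHONE]."`, so downstream training sees placeholders instead of the raw identifiers.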
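The differential-privacy step follows the standard DP-SGD recipe: clip each per-example gradient to a fixed L2 bound, aggregate, and add Gaussian noise calibrated to that bound. A minimal NumPy sketch, assuming illustrative hyperparameters (`clip_norm`, `noise_multiplier`) rather than values from the source:

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD aggregation step (sketch): clip each per-example gradient
    to L2 norm <= clip_norm, sum, add Gaussian noise scaled by the clip
    bound, and average. This caps any single example's influence."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = [
        g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
        for g in per_example_grads
    ]
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)
```

Clipping bounds the sensitivity of the aggregate to any one record, and the noise scale relative to that bound is what yields the formal privacy guarantee; in practice a library such as Opacus or TensorFlow Privacy would also track the cumulative privacy budget.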