2. Privacy & Security | 1 - Pre-deployment

Private Training Data

As recent LLMs continue to incorporate licensed, created, and publicly available data sources into their corpora, the risk of private data being mixed into the training corpus increases significantly. Such misused private data, also known as personally identifiable information (PII) [84], [86], can span many types of sensitive attributes, including an individual's name, email address, phone number, home address, education, and career history. PII generally enters LLMs in two settings: the exploitation of web-collected data and alignment on personal human-machine conversations [87]. Specifically, web-collected data can be crawled from online sources that contain sensitive PII, while personal human-machine conversations may be collected for SFT and RLHF.
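A minimal sketch of how web-collected text might be screened for the structured PII categories listed above. The regex patterns and the `redact_pii` function name are illustrative assumptions, not a production filter; free-text identifiers such as names and addresses generally require NER models rather than patterns:

```python
import re

# Illustrative patterns for two easily structured PII types (emails, phones).
# Real sanitization pipelines combine patterns like these with NER models
# for names, home addresses, and other free-text identifiers.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact_pii(text: str) -> str:
    """Replace each matched PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Contact Jane at jane.doe@example.com or +1 555-123-4567."))
```

Redaction of this kind happens before model ingestion, so the placeholders (rather than the raw values) are what the LLM can memorize.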

Source: MIT AI Risk Repository (mit32)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

1 - Pre-deployment

Risk ID

mit32

Domain lineage

2. Privacy & Security

186 mapped risks

2.1 > Compromise of privacy by leaking or correctly inferring sensitive information

Mitigation strategy

1. Rigorous Pre-training Data Minimization and Sanitization
Implement robust data engineering protocols for the preprocessing pipeline, including the automatic identification, filtering, and sanitization of Personally Identifiable Information (PII) and other sensitive data categories via anonymization, pseudonymization, or data generalization prior to model ingestion. Concurrently, conduct thorough data deduplication across the corpus to minimize the probability of verbatim memorization of unique data sequences by the large language model (LLM).

2. Integration of Privacy-Preserving Machine Learning Architectures
Integrate advanced privacy-preserving frameworks, specifically Differential Privacy (DP) mechanisms, directly into the model training process, particularly during fine-tuning or collaborative learning (e.g., Federated Learning), to formally bound the influence of any single data point and mathematically restrict the potential for training data extraction attacks.

3. Strict Security Posture and Access Governance
Enforce a comprehensive security architecture that mandates end-to-end encryption for all sensitive training data, at rest and in transit, using ratified industry standards (e.g., AES-256). Complement this with granular Role-Based Access Control (RBAC) applied to datasets and model checkpoints, alongside mandatory audit trails, to mitigate insider threats and ensure full traceability and accountability of data access.
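The deduplication step in the first mitigation can be sketched with content hashing: hash overlapping n-word shingles of each normalized document and drop documents whose shingles are mostly already seen. The shingle size and overlap threshold here are arbitrary illustrative choices, not values from the source:

```python
import hashlib

def shingle_hashes(text: str, n: int = 8) -> set[str]:
    """Hash every n-word window of the lowercased text."""
    words = text.lower().split()
    return {
        hashlib.sha256(" ".join(words[i:i + n]).encode()).hexdigest()
        for i in range(max(1, len(words) - n + 1))
    }

def deduplicate(docs: list[str], overlap_threshold: float = 0.5) -> list[str]:
    """Keep a document only if most of its shingles are unseen so far."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        hashes = shingle_hashes(doc)
        overlap = len(hashes & seen) / len(hashes)
        if overlap < overlap_threshold:  # mostly novel content: keep it
            kept.append(doc)
            seen |= hashes
    return kept
```

Exact-hash shingling catches verbatim and near-verbatim repeats, which are precisely the sequences an LLM is most likely to memorize; production pipelines often use MinHash or suffix-array methods at corpus scale.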
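The Differential Privacy mechanism in the second mitigation is typically realized as DP-SGD: clip each example's gradient to a fixed norm bound (the sensitivity), average, and add calibrated Gaussian noise before the parameter update. A dependency-free sketch of that per-step transformation; the clip norm and noise multiplier are illustrative defaults, and real training would use a DP library such as Opacus rather than this toy function:

```python
import math
import random

def dp_sgd_step(per_example_grads: list[list[float]],
                clip_norm: float = 1.0,
                noise_multiplier: float = 1.0,
                seed: int = 0) -> list[float]:
    """Clip each per-example gradient, sum, add Gaussian noise, and average."""
    rng = random.Random(seed)
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale the gradient down so its L2 norm is at most clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for j in range(dim):
            summed[j] += grad[j] * scale
    # Noise std is calibrated to the clipped sensitivity of a single example.
    sigma = noise_multiplier * clip_norm
    n = len(per_example_grads)
    return [(summed[j] + rng.gauss(0.0, sigma)) / n for j in range(dim)]
```

Because each example's contribution is bounded by `clip_norm` and masked by noise, the influence of any single data point on the released update is formally limited, which is what restricts training-data extraction attacks.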