Privacy
Generative AI systems, like traditional machine learning systems, pose risks to privacy and data protection norms. A major concern is the intentional extraction or inadvertent leakage of sensitive or private information from LLMs. Proposed mitigations include sanitizing training data to remove sensitive information and training on synthetic data instead.
ENTITY
3 - Other
INTENT
3 - Other
TIMING
3 - Other
Risk ID
mit74
Domain lineage
2. Privacy & Security
2.1 > Compromise of privacy by leaking or correctly inferring sensitive information
Mitigation strategy
1. **Implement Preemptive Data Hygiene and Minimization.** Rigorously sanitize all training datasets through methods such as data redaction, pseudonymization, or full anonymization to eliminate Personally Identifiable Information (PII) and confidential data. A foundational strategy is to train the model on high-quality synthetic data, circumventing the use of real sensitive data entirely and reducing the overall attack surface.
2. **Establish Robust Access and Environmental Security Controls.** Enforce the Principle of Least Privilege (PoLP) and Role-Based Access Control (RBAC) to limit data access exclusively to authorized personnel and necessary system components. Ensure end-to-end encryption for all data at rest and in transit, and isolate the LLM inference environment within a secured, private network to prevent unauthorized external exposure.
3. **Deploy Multi-Stage Runtime Leakage Prevention.** Integrate strict output filtering and post-processing layers that scan the LLM's final responses for any unintended disclosure of sensitive information. Complement this defense with mandatory secure prompt templates and query sanitization at the input stage, providing a guardrail against adversarial prompt-injection attacks that attempt to induce data leakage.
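The redaction (strategy 1) and output-filtering (strategy 3) steps above can be sketched as a simple pattern-based scanner. This is an illustrative minimal sketch, not a production implementation: the `PII_PATTERNS` table, placeholder labels, and function names are assumptions, and a real deployment would use a vetted PII-detection library (for example, Microsoft Presidio) rather than hand-rolled regular expressions.

```python
import re

# Hypothetical patterns for a few common PII types. A production system
# would rely on a maintained PII-detection library instead of these regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each matched PII span with a typed placeholder,
    e.g. 'alice@example.com' becomes '[EMAIL]'."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def filter_llm_output(response: str) -> str:
    """Post-processing layer (strategy 3): scan a model response for
    sensitive spans before returning it to the user."""
    return redact_pii(response)
```

The same `redact_pii` routine can run as a preprocessing pass over training corpora (strategy 1) and as a runtime output filter (strategy 3); only the point of integration differs.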