Association in LLMs

Association in LLMs refers to a model's capability to link separate pieces of information about a person. Following [68], [86], a pair of PII entities (xi, xj) is associated by a model F if a prompt p related to the entity xi can cause F to produce the entity xj. For instance, if an LLM associates Alice with her email “alice@email.com”, it can correctly complete the prompt “The email address of Alice is” with that address.
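The definition above can be read as a simple probe: build a prompt from xi and check whether the model's completion contains xj. The following is a minimal sketch under that reading. The names `toy_model` and `is_associated` and the prompt template are hypothetical stand-ins, not taken from [68] or [86]; a real evaluation would query an actual LLM in place of the lookup table.

```python
from typing import Callable

# Hypothetical stand-in for the model F: a lookup table that has
# "memorized" exactly one (prompt, completion) pair.
_MEMORIZED = {"The email address of Alice is": " alice@email.com"}


def toy_model(prompt: str) -> str:
    """Return a canned completion; a real probe would call an LLM here."""
    return _MEMORIZED.get(prompt, " unknown")


def is_associated(
    model: Callable[[str], str],
    x_i: str,
    x_j: str,
    template: str = "The email address of {name} is",
) -> bool:
    """Return True if a prompt built from x_i makes the model emit x_j."""
    prompt = template.format(name=x_i)
    return x_j in model(prompt)


print(is_associated(toy_model, "Alice", "alice@email.com"))  # True
print(is_associated(toy_model, "Bob", "bob@email.com"))      # False
```

A substring match is the simplest possible success criterion; it is assumed here only for illustration and would need tightening for entities with many surface forms.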

Source: MIT AI Risk Repository (mit34)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

1 - Pre-deployment

Risk ID

mit34

Domain lineage

2. Privacy & Security

186 mapped risks

2.1 > Compromise of privacy by leaking or correctly inferring sensitive information

Mitigation strategy

1. Prioritize **Data Sanitization and Reversible Anonymization** for Input Streams. Establish a mandatory, automated pipeline that sanitizes and anonymizes all Personally Identifiable Information (PII) before the LLM processes it. Techniques such as tokenization, masking, or format-preserving encryption decouple associated entities within the data, which undermines the model's ability to produce one entity (xj) from a prompt related to another (xi).

2. Integrate **Differential Privacy (DP)** during Model Training. Apply DP techniques during pre-training or fine-tuning to introduce calibrated statistical noise. This bounds the influence of any individual data point on the model's weights, which mitigates memorization and the subsequent linkage of disparate PII entities within the model itself.

3. Enforce **Contextual Output Filtering and Guardrails**. Implement an inference-time control layer that uses content moderation tools to monitor and redact the LLM's generated output. This layer should draw on contextual integrity principles and semantic analysis to block the disclosure of inferred or explicitly associated sensitive information before it reaches the end user, especially in multi-turn conversational or agentic applications.
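Items 1 and 3 of the mitigation strategy can be illustrated with a small sketch: reversible pseudonymization of input text and a regex-based redaction pass over model output. This is an assumption-laden toy, not a production design. It handles only email addresses, the regex is deliberately simple, and the function names (`pseudonymize`, `restore`, `redact_output`) and placeholder formats are hypothetical. Differential privacy (item 2) is not shown.

```python
import re
from typing import Dict, Tuple

# Simplified email pattern, assumed for illustration; real PII detection
# would cover many more entity types and formats.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct email with a placeholder.

    Returns the masked text and a placeholder -> original mapping, which
    must be stored separately and securely so the masking stays reversible.
    """
    mapping: Dict[str, str] = {}  # placeholder -> original
    seen: Dict[str, str] = {}     # original -> placeholder

    def _sub(match: re.Match) -> str:
        original = match.group(0)
        if original not in seen:
            token = f"<EMAIL_{len(seen) + 1}>"
            seen[original] = token
            mapping[token] = original
        return seen[original]

    return EMAIL_RE.sub(_sub, text), mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Reverse pseudonymize() using the stored mapping."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def redact_output(text: str) -> str:
    """Inference-time filter: strip any email from generated output."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


raw = "Mail alice@email.com, then cc alice@email.com and bob@email.com today"
masked, mapping = pseudonymize(raw)
print(masked)  # Mail <EMAIL_1>, then cc <EMAIL_1> and <EMAIL_2> today
print(restore(masked, mapping) == raw)  # True
print(redact_output("The email address of Alice is alice@email.com"))
# The email address of Alice is [REDACTED_EMAIL]
```

Keeping the mapping outside the training or prompt data is what decouples xi from xj: the model only ever sees the placeholder, while authorized systems can still restore the original value.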