Prompt priming
Because generative models tend to produce output like the input provided, the model can be prompted to reveal specific kinds of information. For example, adding personal information in the prompt increases its likelihood of generating similar kinds of personal information in its output. If personal data was included as part of the model’s training, there is a possibility it could be revealed.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1291
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Implement Robust Prompt Isolation and Input Validation Require the use of specific delimiters, such as XML or JSON tags, to clearly segment user-provided input from system instructions within the prompt. This structural isolation prevents the generative model from confusing user data with directives, mitigating the primary risk of user-supplied sensitive information leading to the generation of similar data (Prompt Priming). Furthermore, all user input must undergo a validation or sanitization process to detect and neutralize known adversarial or PII-laden tokens. 2. Enforce Least Privilege Context Design Adhere to the principle of least privilege by strictly limiting the sensitive or confidential information that is injected into the model's context window. The application should only provide the minimum required data necessary for the LLM to complete the query, thereby substantially reducing the total surface area for potential sensitive information disclosure (Inference Risk) even if a successful priming event occurs. 3. Deploy Automated Output Filtering Guardrails Establish a secondary, non-LLM-based monitoring layer to perform real-time content analysis of the model's responses. This layer must utilize heuristics or an independent classification model to detect and redact any accidentally generated personally identifiable information (PII) or confidential data before the output is delivered to the end-user, serving as a critical final-stage defense.