Exploiting External Tools for Attacks
Adversarial tool providers can embed malicious instructions in their APIs or prompts [84], causing LLMs to leak sensitive information memorized from the training data or contained in users' prompts (CVE-2023-32786). Because the LLM has limited control over its output, this sensitive information may be disclosed to external tool providers. Attackers can also manipulate public data to launch targeted attacks, generating specific malicious outputs conditioned on user inputs. Furthermore, feeding information from external tools into LLMs can enable injection attacks [61]; for example, unverified inputs may result in arbitrary code execution (CVE-2023-29374).
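The injection path above can be sketched as a screening step applied to tool responses before they enter the LLM context. This is a minimal illustration, not a complete defense: the `screen_tool_output` helper and the regex patterns are hypothetical stand-ins for a maintained injection detector.

```python
import re

# Illustrative patterns that often signal injected instructions in
# tool/API responses; a real deployment would use a maintained,
# regularly updated detector rather than this hand-written list.
INJECTION_PATTERNS = [
    r"ignore (all|any|previous|prior) instructions",
    r"disregard .* system prompt",
    r"you are now",
]

def screen_tool_output(text: str) -> str:
    """Flag tool responses that resemble injected instructions
    before they are interpolated into the LLM prompt."""
    lowered = text.lower()
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, lowered):
            raise ValueError("possible prompt injection in tool output")
    return text

# Benign output passes through unchanged; instruction-like output is blocked.
screen_tool_output("Current weather: 18C, partly cloudy")
```

Pattern matching alone cannot catch paraphrased or encoded injections, which is why the mitigation strategy below pairs it with access control and output redaction.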
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit30
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
- Prioritize rigorous runtime verification and sanitization of all inputs and outputs associated with external tools and APIs, specifically focusing on mitigating code injection vulnerabilities and preventing arbitrary code execution.
- Implement granular Attribute-Based Access Control (ABAC) to restrict the scope of data access and the specific tools an LLM can invoke, thereby limiting the potential for unauthorized information disclosure to adversarial tool providers.
- Deploy Named Entity Recognition (NER) filtering or similar redaction mechanisms on both user prompts and LLM-generated content to minimize the exposure of memorized sensitive information or PII during interactions with external tools.
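The redaction mitigation can be sketched as a pre-send filter over text exchanged with external tools. This is a minimal sketch using regexes as a stand-in for a full NER pipeline (e.g., spaCy); the `PII_PATTERNS` table and `redact` helper are illustrative assumptions, not a production filter.

```python
import re

# Hypothetical PII patterns; a real system would use an NER model and
# cover many more entity types (names, addresses, credentials, ...).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII spans with typed placeholders before the
    text is forwarded to an external tool provider."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

redact("Contact alice@example.com, SSN 123-45-6789")
# -> "Contact [EMAIL], SSN [SSN]"
```

Applying the same filter to both outbound user prompts and inbound LLM generations limits what memorized or user-supplied PII can reach an adversarial tool provider.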