Model Attacks
Model attacks exploit vulnerabilities in LLMs to extract valuable information or induce incorrect responses.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
3 - Other
Risk ID
mit44
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Implement rigorous input sanitization, validation, and filtering mechanisms, such as whitelist filtering and contextual guardrails, to preemptively mitigate the execution of adversarial and malicious prompts.
2. Establish a continuous security lifecycle integrating automated adversarial testing, prompt fuzzing, and specialized red-teaming exercises to proactively discover and address emergent model vulnerabilities.
3. Enforce the Principle of Least Privilege (PoLP) via robust Role-Based Access Controls (RBAC) to ensure the LLM agent's access to backend systems and sensitive resources is strictly limited to the minimum necessary for its functional requirements.
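The first mitigation (input sanitization and filtering) can be sketched as a simple pre-processing gate in front of the model. This is a minimal illustration, not a production defense: the pattern list, length limit, and function name are all assumptions for the example, and a real deployment would combine maintained rule sets with model-based classifiers rather than a static denylist.

```python
import re

# Hypothetical denylist of injection phrasings; illustrative only, not a
# complete or maintained rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (the )?system prompt", re.IGNORECASE),
    re.compile(r"<\s*script\b", re.IGNORECASE),
]

MAX_PROMPT_LENGTH = 2000  # assumed limit for this sketch


def sanitize_prompt(prompt: str) -> str:
    """Validate and filter a user prompt before it reaches the LLM.

    Raises ValueError when the prompt fails validation; otherwise
    returns the prompt with non-printable control characters removed.
    """
    if len(prompt) > MAX_PROMPT_LENGTH:
        raise ValueError("prompt exceeds maximum length")
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            raise ValueError("prompt matched a blocked pattern")
    # Strip control characters that can smuggle hidden instructions,
    # keeping ordinary whitespace.
    return "".join(ch for ch in prompt if ch.isprintable() or ch in "\n\t")
```

A benign prompt such as "Summarize this article" passes through unchanged, while a prompt containing "ignore all instructions" is rejected before it ever reaches the model. In practice this gate would sit alongside the contextual guardrails and RBAC controls described above, since filtering alone cannot catch novel adversarial phrasings.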