Back to the MIT repository
2. Privacy & Security3 - Other

Model Attacks

Model attacks exploit the vulnerabilities of LLMs, aiming to steal valuable information or lead to incorrect responses.

Source: MIT AI Risk Repositorymit44

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

3 - Other

Risk ID

mit44

Domain lineage

2. Privacy & Security

186 mapped risks

2.2 > AI system security vulnerabilities and attacks

Mitigation strategy

1. Implement rigorous input sanitization, validation, and filtering mechanisms, such as whitelist filtering and contextual guardrails, to preemptively mitigate the execution of adversarial and malicious prompts. 2. Establish a continuous security lifecycle integrating automated adversarial testing, prompt fuzzing, and specialized red-teaming exercises to proactively discover and address emergent model vulnerabilities. 3. Enforce the Principle of Least Privilege (PoLP) via robust Role-Based Access Controls (RBAC) to ensure the LLM agent's access to backend systems and sensitive resources is strictly limited to the minimum necessary for its functional requirements.