Misuse Risks
Risks arising from intentional exploitation of AI model capabilities by malicious actors to cause harm to individuals, organisations, or society.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1444
Domain lineage
4. Malicious Actors & Misuse
4.0 > Malicious use
Mitigation strategy
1. Implement robust, layered input and output defenses: Systematically constrain the AI model's operational scope by enforcing strict context adherence and validating expected output formats. Deploy multi-layered agent networks and apply semantic and string-checking filters to all inputs and outputs to prevent direct and indirect prompt injection attacks, hardening the model against unauthorized command overrides.
2. Enforce the principle of least privilege and human oversight: Mandate strict Role-Based Access Control (RBAC) to limit employee and model access privileges to the minimum necessary for intended operations. For high-risk actions or critical decision-making, implement mandatory Human-in-the-Loop (HITL) controls so that a person reviews and approves the action, mitigating unauthorized actions resulting from successful model manipulation.
3. Conduct proactive adversarial testing and red teaming: Continuously assess the AI system's vulnerabilities against the evolving threat landscape through ethical hacking, penetration testing, and structured red-teaming exercises. This is critical for identifying and preemptively closing malicious exploitation vectors, particularly those targeting model behavior and output generation.
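The layered input/output defense in strategy 1 can be sketched as a simple guard around a model call. This is a minimal illustration, not a production defense: the denylist patterns, the JSON-shaped output format, and the function names are all assumptions introduced here for demonstration.

```python
import re

# Hypothetical denylist of phrases associated with prompt-injection attempts
# (illustrative only; real systems combine string checks with semantic filters).
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"reveal (the )?system prompt",
]

def passes_input_filter(user_input: str) -> bool:
    """String-checking layer: reject inputs matching known injection patterns."""
    lowered = user_input.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def passes_output_filter(model_output: str) -> bool:
    """Output-validation layer: accept only the expected format
    (here, assumed to be a single JSON-like object)."""
    return re.fullmatch(r"\{.*\}", model_output.strip(), flags=re.DOTALL) is not None

def guarded_call(user_input: str, model_fn) -> str:
    """Layered defense: filter the input, call the model, validate the output."""
    if not passes_input_filter(user_input):
        return "REJECTED: input failed injection screen"
    output = model_fn(user_input)
    if not passes_output_filter(output):
        return "REJECTED: output failed format check"
    return output
```

Because each layer is independent, a bypass of the string filter can still be caught by the output-format check, which is the point of defense in depth.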
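Strategies 2's RBAC and HITL controls compose naturally: least privilege decides whether a role may ever perform an action, and the HITL gate adds a human sign-off for the high-risk subset. A minimal sketch, with hypothetical roles, actions, and permission tables invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical set of actions classified as high risk (illustrative only).
HIGH_RISK_ACTIONS = {"delete_records", "change_permissions"}

# Minimal RBAC table: each role maps to the least set of actions it needs.
ROLE_PERMISSIONS = {
    "analyst": {"read_reports"},
    "admin": {"read_reports", "change_permissions"},
}

@dataclass
class ActionRequest:
    actor_role: str
    action: str

def authorize(req: ActionRequest, human_approved: bool = False) -> bool:
    """Enforce least privilege first, then require human approval
    (the HITL control) for high-risk actions."""
    allowed = req.action in ROLE_PERMISSIONS.get(req.actor_role, set())
    if not allowed:
        return False  # RBAC: role lacks the privilege outright
    if req.action in HIGH_RISK_ACTIONS:
        return human_approved  # HITL gate: a person must sign off
    return True
```

Note the ordering: RBAC is checked before the HITL gate, so human approval cannot override a missing privilege, only confirm an action the role is already entitled to request.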