Unhelpful Uses
Improper uses of LLM systems can cause adverse social impacts.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit14
Domain lineage
4. Malicious Actors & Misuse
4.3 > Fraud, scams, and targeted manipulation
Mitigation strategy
1. Implement a layered defense framework incorporating continuous input sanitization and adversarial filtering to block malicious instructions (prompt injection, jailbreaking) and to screen generated content for toxic, harmful, or fraudulent material.
2. Establish rigorous system-level security and access controls, including strong user authentication and real-time usage monitoring with rate limiting, to prevent unauthorized access, model exfiltration, and resource-exhaustion attacks.
3. Institute a mandatory, recurring red-teaming program using diverse adversarial prompting techniques to proactively identify and mitigate emerging security vulnerabilities and bypasses of the LLM's safety-alignment layer.