4. Malicious Actors & Misuse | 2 - Post-deployment

Cyber offense

Cyber risks, especially in the context of cyber offense, are an existing threat that may be exacerbated by AI. [108] demonstrated that teams of LLM agents can exploit zero-day vulnerabilities when given a description of the vulnerability and toy capture-the-flag problems. While cyber risks are not typically regarded as catastrophic, [3] argues that cyberwarfare is an underappreciated risk that poses a credible threat of catastrophic harm.

Source: MIT AI Risk Repository (mit1390)

ENTITY

3 - Other

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit1390

Domain lineage

4. Malicious Actors & Misuse

223 mapped risks

4.2 > Cyberattacks, weapon development or use, and mass harm

Mitigation strategy

1. **Prioritize the deployment of specialized AI Firewalls and robust API Gateways** with rate-limiting and semantic intent analysis capabilities to act as a security layer for LLMs. This is critical for filtering malicious inputs (e.g., prompt injection) and preventing data or model exfiltration, directly mitigating the risk of LLM agents autonomously weaponizing vulnerabilities.

2. **Implement advanced, AI-driven real-time threat detection systems,** such as User and Entity Behavior Analytics (UEBA) and LLM-powered security solutions, for continuous monitoring of network activity. This capability is essential to rapidly identify and contain anomalous or unusual behavior indicative of zero-day exploitation, outpacing the velocity of AI-enhanced attacks.

3. **Establish and strictly enforce comprehensive AI Governance and Zero Trust security frameworks** for both AI models and critical infrastructure. This must encompass robust identity-centric security for non-human AI identities, immutable model versioning, and mandatory adversarial testing (AI Red Teaming) to proactively uncover and mitigate systemic vulnerabilities before deployment, addressing the potential for catastrophic cyberwarfare.
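
The rate-limiting capability named in the first mitigation can be illustrated with a token bucket, one common way to cap how many requests a single client or agent identity may send to an LLM endpoint. This is a minimal sketch and not a description of any particular gateway product: the class names `TokenBucket` and `GatewayRateLimiter`, the per-client keying, and the parameter values are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Holds up to `capacity` tokens and refills at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = capacity          # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class GatewayRateLimiter:
    """Hypothetical gateway component: one independent bucket per client identity."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str) -> bool:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_rate, self.clock)
            self._buckets[client_id] = bucket
        return bucket.allow()


if __name__ == "__main__":
    # Assumed values: bursts of up to 5 requests, refilled at 1 request per second.
    limiter = GatewayRateLimiter(capacity=5, refill_rate=1.0)
    decisions = [limiter.allow("agent-a") for _ in range(7)]
    print(decisions)
```

Keying buckets by client identity means one noisy or compromised agent exhausts only its own budget. Rate limiting alone does not inspect request content, so in the mitigation as written it would sit alongside the semantic intent analysis rather than replace it.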