Back to the MIT repository
4. Malicious Actors & Misuse2 - Post-deployment

Misuse risks

Frontier AI may help bad actors to perform cyberattacks, run disinformation campaigns and design biological or chemical weapons. Frontier AI will almost certainly continue to lower the barriers to entry for less sophisticated threat actors.192 We focus here on only a few important misuse risks, but this is not to downplay the importance of others.

Source: MIT AI Risk Repositorymit1379

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit1379

Domain lineage

4. Malicious Actors & Misuse

223 mapped risks

4.0 > Malicious use

Mitigation strategy

1. Strict Access and Operational Security Control: Implement robust **Access Control** policies, including user verification (Know-Your-Customer) and compute monitoring, alongside a **Defense-in-Depth** security architecture to govern who can utilize the model's capabilities and to protect the system from unauthorized access or theft by malicious actors. 2. Pre-deployment Capability Elicitation and Targeted Mitigation: Systematically conduct **Capability Elicitation** evaluations against predefined **Critical Capability Levels (CCLs)** (e.g., for cyberoffense or biological assistance) to detect high-risk functionality. Subsequently, apply **Targeted Unlearning** or **Safety Fine-Tuning (SFT)** (refusal training) to remove or substantially minimize the model's ability to assist in high-consequence misuse cases. 3. Policy-Driven Deployment Halting and Accountability: Institute non-negotiable **Conditions for Halting Deployment Plans**, requiring the suspension of deployment if unmitigated capabilities posing severe or catastrophic harm are detected. This must be complemented by transparent, independent **Accountability** and external oversight mechanisms to ensure rigorous and consistent execution of the safety policy.