4. Malicious Actors & Misuse2 - Post-deployment

Abuse & Misuse

The potential for AI systems to be used maliciously or irresponsibly, including for creating deepfakes, automated cyber attacks, or invasive surveillance systems. Specifically denotes intentional use of AI for harm.

Source: MIT AI Risk Repositorymit158

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit158

Domain lineage

4. Malicious Actors & Misuse

223 mapped risks

4.2 > Cyberattacks, weapon development or use, and mass harm

Mitigation strategy

1. Establish an AI Governance and Security Framework Implement a formal AI governance strategy encompassing secure-by-design principles, policies, and processes across the entire AI lifecycle. This foundational step requires a secure implementation approach, the safeguarding of all training data, continuous auditing of data quality and provenance, and the enforcement of least-privilege access and zero-trust security controls to mitigate insider risk and unauthorized exploitation. 2. Conduct Continuous Adversarial Testing and Risk Assessment Integrate repeatable and comprehensive AI risk assessments, threat modeling, and adversarial testing (AI Red Teaming) into the development and post-deployment stages. This proactive measure is necessary to identify model vulnerabilities, test resilience against evolving attack vectors such as prompt injection and data poisoning, and ensure the system maintains predictable behavior under real-world pressure. 3. Deploy Multi-Layered Detection and Human Oversight Implement specific technical and procedural controls to detect and counteract malicious outputs like deepfakes and disinformation. This includes deploying specialized AI-powered detection tools, utilizing digital provenance technologies (such as watermarking) to verify content authenticity, and establishing human oversight and validation mechanisms to review and correct high-risk or potentially harmful AI outputs.