
Capabilities that could be used to reduce human control - Cyber offence

Instead of - or in addition to - manipulating humans, AI systems could acquire influence by exploiting vulnerabilities in computer systems. Offensive cyber capabilities could allow AI systems to gain access to money, computing resources, and critical infrastructure. As discussed earlier in this report, frontier AI is already lowering the barrier for threat actors, and future AI agents may be able to execute cyber attacks autonomously.

Source: MIT AI Risk Repository (mit1387)

ENTITY

2 - AI

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit1387

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.2 > AI possessing dangerous capabilities

Mitigation strategy

1. Layered AI-Native Cyber Defense Implementation: Deploy advanced, multi-layered security architectures that integrate AI-driven threat detection, anomaly identification, and autonomous defensive systems. This necessitates real-time network traffic analysis, User and Entity Behavior Analytics (UEBA), and automated incident response capabilities to proactively counteract machine-speed, AI-enhanced attacks that exploit system vulnerabilities.

2. Autonomous System Governance and Control: Enforce rigorous AI system safety protocols, including stringent input validation, continuous runtime behavioral monitoring, and the application of least-privilege access policies to model endpoints. This mitigation is critical to prevent the AI system itself from escalating into an autonomous threat, ensuring that model queries and internal actions are treated with the highest security scrutiny.

3. Formalized AI-Specific Incident Response and Disaster Recovery: Develop and continually refine a comprehensive incident response plan that explicitly addresses AI-powered cyberattacks, encompassing immediate system isolation, breach analysis, and predefined communication strategies. Concurrently, maintain a formal disaster recovery plan with offsite backups and system reconstitution procedures to ensure operational continuity following a successful, autonomously executed compromise.
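
As a rough illustration of the second mitigation only, the sketch below combines a least-privilege allowlist with a simple runtime behavioral check on actions requested by an AI agent. It is a minimal example under stated assumptions, not a method prescribed by the repository: the role names, action names, `ALLOWED_ACTIONS` table, `ActionGate` class, and the rate thresholds are all hypothetical and chosen purely for illustration.

```python
import time
from collections import deque
from dataclasses import dataclass, field

# Hypothetical allowlist: each agent role maps to the only actions it may invoke.
ALLOWED_ACTIONS = {
    "support-agent": {"read_ticket", "draft_reply"},
    "ops-agent": {"read_metrics", "restart_service"},
}


@dataclass
class ActionGate:
    """Checks every agent-initiated action before it is executed."""

    max_actions_per_window: int = 5   # assumed threshold, tune per deployment
    window_seconds: float = 60.0      # assumed sliding-window length
    _recent: dict = field(default_factory=dict)

    def authorize(self, role, action, now=None):
        """Return (allowed, reason) for one requested action."""
        now = time.monotonic() if now is None else now

        # Least privilege: deny anything not explicitly allowlisted for the role.
        if action not in ALLOWED_ACTIONS.get(role, set()):
            return (False, "action not permitted for role")

        # Runtime monitoring: count allowed actions in a sliding time window.
        history = self._recent.setdefault(role, deque())
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_actions_per_window:
            return (False, "rate anomaly: escalate to human review")

        history.append(now)
        return (True, "ok")


if __name__ == "__main__":
    gate = ActionGate(max_actions_per_window=2, window_seconds=10.0)
    print(gate.authorize("support-agent", "read_ticket", now=0.0))
    print(gate.authorize("support-agent", "restart_service", now=1.0))
```

The gate denies by default, so an unknown role or an unlisted action is refused without any further checks. A fixed rate limit is a deliberately crude stand-in for the behavioral analytics described above; a real deployment would likely replace it with richer signals and route denials into the incident response process.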