Capabilities that could be used to reduce human control - Cyber offence
Instead of - or in addition to - manipulating humans, AI systems could acquire influence by exploiting vulnerabilities in computer systems. Offensive cyber capabilities could allow AI systems to gain access to money, computing resources, and critical infrastructure. As discussed earlier in this report, frontier AI is already lowering the barrier for threat actors and future AI agents may be able to execute cyber attacks autonomously.:
ENTITY
2 - AI
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1387
Domain lineage
7. AI System Safety, Failures, & Limitations
7.2 > AI possessing dangerous capabilities
Mitigation strategy
1. Layered AI-Native Cyber Defense Implementation: Deploy advanced, multi-layered security architectures that integrate AI-driven threat detection, anomaly identification, and autonomous defensive systems. This necessitates real-time network traffic analysis, User and Entity Behavior Analytics (UEBA), and automated incident response capabilities to proactively counteract machine-speed, AI-enhanced attacks that exploit system vulnerabilities. 2. Autonomous System Governance and Control: Enforce rigorous AI system safety protocols, including stringent input validation, continuous runtime behavioral monitoring, and the application of least-privilege access policies to model endpoints. This mitigation is critical to prevent the AI system itself from escalating into an autonomous threat, ensuring that model queries and internal actions are treated with the highest security scrutiny. 3. Formalized AI-Specific Incident Response and Disaster Recovery: Develop and continually refine a comprehensive incident response plan that explicitly addresses AI-powered cyberattacks, encompassing immediate system isolation, breach analysis, and predefined communication strategies. Concurrently, maintain a formal disaster recovery plan with offsite backups and system reconstitution procedures to ensure operational continuity following a successful, autonomously executed compromise.