4. Malicious Actors & Misuse

Malicious Code Generation

Malicious code is a term for code, whether part of a script or embedded in a software system, designed to cause damage, security breaches, or other threats to application security. Advanced AI assistants that can produce source code may lower the barrier to entry for threat actors with limited programming abilities or technical skills. Recently, a series of proof-of-concept attacks has shown how a benign-seeming executable file can be crafted so that, on each run, it makes application programming interface (API) calls to an AI assistant. Rather than simply reproducing already-written code snippets, the AI assistant can be prompted to generate a dynamic, mutated variant of the malicious code at each call, making the resulting exploits difficult for cybersecurity tools to detect. Advanced AI assistants could also be used to obfuscate code, making it harder for defensive cyber capabilities to detect and understand malicious activity, and AI-generated code can be iterated quickly to evade traditional signature-based antivirus software. Finally, advanced AI assistants with source-code capabilities have been found capable of assisting in the development of polymorphic malware, which changes its behavior and digital footprint each time it is executed and is therefore hard to detect with antivirus programs that rely on known virus signatures. Taken together, without proper mitigation, advanced AI assistants can lower the barrier to developing malicious code, make cyberattacks more precise and tailored, further accelerate and automate cyber warfare, enable stealthier and more persistent offensive cyber capabilities, and make cyber campaigns more effective at larger scale.
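The weakness of signature-based detection described above can be illustrated with a minimal, harmless sketch: two functionally identical code snippets that differ only in surface details produce entirely different cryptographic hashes, so a blocklist keyed on the hash of one variant misses the other. (This is an illustrative toy, not an actual detection or evasion tool; the snippet contents are benign arithmetic.)

```python
import hashlib

# Two functionally identical snippets that differ only in variable names.
# A mutation engine, or an AI assistant asked to "rewrite" code, can emit
# endless such variants, each with a distinct hash.
variant_a = "total = 0\nfor x in range(10):\n    total += x\n"
variant_b = "acc = 0\nfor i in range(10):\n    acc += i\n"

sig_a = hashlib.sha256(variant_a.encode()).hexdigest()
sig_b = hashlib.sha256(variant_b.encode()).hexdigest()

# Both variants compute the same result (sum of 0..9 = 45)...
ns_a, ns_b = {}, {}
exec(variant_a, ns_a)
exec(variant_b, ns_b)
assert ns_a["total"] == ns_b["acc"] == 45

# ...yet their signatures differ, so a hash-based blocklist that knows
# variant_a will not flag variant_b.
assert sig_a != sig_b
print("same behavior, different signatures:", sig_a[:12], "vs", sig_b[:12])
```

This is why the mitigation strategy below emphasizes behavioral and anomaly-based detection over purely signature-based approaches.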

Source: MIT AI Risk Repository (mit380)

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit380

Domain lineage

4. Malicious Actors & Misuse


4.2 > Cyberattacks, weapon development or use, and mass harm

Mitigation strategy

1. Deploy advanced, non-signature-based threat detection mechanisms, specifically integrating behavioral analysis, machine learning/AI-driven models, and Network Detection and Response (NDR) capabilities to identify the dynamic and polymorphic variants of AI-generated malicious code.
2. Implement rigorous Secure Software Development Lifecycle (SSDLC) practices, including mandatory API security controls and the application of LLM-specific defensive techniques (e.g., input sanitization, adversarial training, and watermarking) to mitigate the model's ability to generate exploit code.
3. Establish a Zero Trust architecture, strictly enforcing the Principle of Least Privilege (PoLP) and leveraging network segmentation to limit the lateral movement, propagation, and potential damage caused by stealthy, persistent, and dynamically mutating malware.