7. AI System Safety, Failures, & Limitations

AI System bypassing a sandbox environment

An AI system may be able to bypass the sandboxed environment in which it is trained or evaluated.

Source: MIT AI Risk Repository (mit1164)

ENTITY

2 - AI

INTENT

3 - Other

TIMING

1 - Pre-deployment

Risk ID

mit1164

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.2 > AI possessing dangerous capabilities

Mitigation strategy

1. **Implement layered defense-in-depth isolation.** Establish a multi-layered execution environment, prioritizing (a) process isolation with strict resource limits, (b) container or microVM isolation with read-only filesystems and default-deny network access, and (c) kernel-level system-call filtering (e.g., seccomp-bpf), so that a breach of any single layer results in containment rather than compromise of the host system.
2. **Mandate strict runtime controls and a least-privilege policy.** Deploy continuous runtime behavioral analytics and anomaly detection to identify unusual resource usage or execution patterns indicative of an attempted escape. Concurrently, enforce the principle of least privilege by restricting all network and host system access, including filesystem interaction and allowed Unix sockets, to the absolute minimum required for the AI system's function.
3. **Establish a formal Safeguard Bypass Disclosure Programme.** Institute a structured Safeguard Bypass Disclosure Programme (SBDP) or Bounty Programme (SBBP) to crowdsource adversarial testing, enabling the discovery of novel and latent bypass vulnerabilities that are difficult to identify through internal review alone. Pair this with an agile remediation and patching process for model updates.
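The first layer above, process isolation with strict resource limits, can be sketched in standard-library Python. This is an illustrative minimal example, not the repository's recommended implementation: the function names (`run_sandboxed`, `_apply_limits`) and the specific limit values are assumptions, it is POSIX-only, and on its own it provides none of the container, network, or seccomp layers the strategy also calls for.

```python
import resource
import subprocess
import sys

def _apply_limits():
    """Pre-exec hook: cap the child's resources before its code runs (POSIX only)."""
    # Hard CPU-time ceiling: the kernel kills the child after 2s of CPU time.
    resource.setrlimit(resource.RLIMIT_CPU, (2, 2))
    # Address-space cap so runaway allocation cannot exhaust host memory.
    resource.setrlimit(resource.RLIMIT_AS, (1 << 30, 1 << 30))
    # Forbid spawning further processes (blocks fork-based escape attempts;
    # note this limit is not enforced when running as root).
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))

def run_sandboxed(code: str) -> subprocess.CompletedProcess:
    """Run untrusted Python source in a resource-limited child process."""
    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores env/site
        preexec_fn=_apply_limits,
        capture_output=True,
        text=True,
        timeout=5,  # wall-clock backstop enforced by the parent
    )

if __name__ == "__main__":
    print(run_sandboxed("print(2 + 2)").stdout.strip())
```

In line with the defense-in-depth framing, these limits are deliberately redundant with the outer layers: even if the container boundary fails, the kernel still enforces the per-process caps, so a single bypass yields containment rather than host compromise.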