Tool utilization propensity
The propensity of an AI system to actively seek, acquire, and use tools to expand its own capability boundaries, particularly tools that enhance its ability to interact with the physical world or increase its autonomy. The system may also combine tools in innovative ways to achieve functions beyond what was expected.
ENTITY
2 - AI
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1479
Domain lineage
7. AI System Safety, Failures, & Limitations
7.2 > AI possessing dangerous capabilities
Mitigation strategy
1. Implement mandatory architectural controls, such as strict sandboxing and permission settings, to narrowly limit the model's access to and execution of external APIs, operating system commands, and internet-connected tools, particularly those that enable unauthorized interaction with the physical world.
2. Establish real-time, comprehensive monitoring, logging, and auditing of all tool-use attempts and system calls. This system must be designed to detect unauthorized, anomalous, or unexpected utilization patterns and trigger immediate human-in-the-loop intervention for any deviation from approved tool-use protocols.
3. Conduct proactive, continuous red teaming and scenario planning exercises to rigorously test the model's latent capacity to combine available tools in novel and unanticipated ways. This process should systematically attempt to expand the model's autonomy or physical world influence in order to discover and mitigate dangerous propensities prior to deployment.
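As an illustration only, the permission control in mitigation 1 and the logging and escalation in mitigation 2 could be combined in a single gate that every tool call must pass through. The sketch below is a minimal example under stated assumptions, not a prescribed implementation: the names `ToolGate`, `ToolDenied`, `escalate`, and `audit_log` are hypothetical, the allowlist is a plain dictionary, and the human-in-the-loop step is represented by a callback.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


class ToolDenied(Exception):
    """Raised when a tool call falls outside the approved protocol."""


@dataclass
class ToolGate:
    # Hypothetical allowlist: tool name -> callable the model may invoke.
    allowed: Dict[str, Callable[..., Any]]
    # Hypothetical hook standing in for human-in-the-loop review.
    escalate: Callable[[dict], None]
    # In-memory audit trail; a real deployment would need durable storage.
    audit_log: List[dict] = field(default_factory=list)

    def call(self, tool: str, **kwargs: Any) -> Any:
        record = {"tool": tool, "args": kwargs, "allowed": tool in self.allowed}
        self.audit_log.append(record)  # log every attempt, permitted or not
        if not record["allowed"]:
            self.escalate(record)  # hand the deviation to a human reviewer
            raise ToolDenied(f"tool {tool!r} is not on the allowlist")
        return self.allowed[tool](**kwargs)


# Usage: one approved tool; escalations are collected in a list for review.
escalations: List[dict] = []
gate = ToolGate(allowed={"add": lambda a, b: a + b}, escalate=escalations.append)
result = gate.call("add", a=2, b=3)
```

A default-deny allowlist is used here because it blocks tools that were never anticipated, which matches the concern about novel tool combinations. Any call to a name outside `allowed` is logged, escalated, and refused with `ToolDenied`.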