Selection Pressures
Selection pressures (Section 3.3): some aspects of training and selection by those deploying and using AI agents can lead to undesirable behaviour.
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
1 - Pre-deployment
Risk ID
mit1225
Domain lineage
7. AI System Safety, Failures, & Limitations
7.6 > Multi-agent risks
Mitigation strategy
1. Redesign Agent Incentive Structures and Fitness Functions to directly penalize strategies that optimize for competitive advantage against other agents (e.g., deception, information hoarding) but diverge from the system's global, human-aligned objective.
2. Implement Continuous Performance Monitoring and Real-Time Anomaly Detection to proactively identify long-term behavioral drift in agent-to-agent interactions before undesirable strategies are cemented by the selection environment.
3. Establish Mandatory Human-in-the-Loop Gates and Clear Escalation Paths for high-autonomy agents and critical decisions, ensuring that the final selection pressure remains a human veto or approval.
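
The three mitigation steps above can be illustrated with a minimal Python sketch. It is not drawn from the source: the names EpisodeStats, penalized_fitness, drift_detected, and select_agent, the penalty weights, the z-score threshold, and the idea that deception and hoarding arrive as pre-counted events are all assumptions made for illustration. A real system would need its own detectors and a much richer monitoring signal.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class EpisodeStats:
    """Hypothetical per-agent summary of one evaluation episode."""
    task_reward: float      # reward on the global, human-aligned objective
    deception_events: int   # assumed count of messages flagged as deceptive
    hoarding_events: int    # assumed count of withheld-information flags


def penalized_fitness(stats, deception_weight=2.0, hoarding_weight=1.0):
    """Step 1: subtract a penalty for competitive-but-misaligned behaviour."""
    penalty = (deception_weight * stats.deception_events
               + hoarding_weight * stats.hoarding_events)
    return stats.task_reward - penalty


def drift_detected(baseline, recent, z_threshold=3.0):
    """Step 2: flag drift when the recent mean of a monitored metric lies
    more than z_threshold baseline standard deviations from the baseline mean."""
    if len(baseline) < 2 or not recent:
        return False
    base_mean = mean(baseline)
    base_std = pstdev(baseline)
    if base_std == 0:
        return mean(recent) != base_mean
    return abs(mean(recent) - base_mean) / base_std > z_threshold


def select_agent(candidates, human_approves):
    """Step 3: rank by penalized fitness, but deploy only a candidate that
    the human reviewer approves; return None if every candidate is vetoed."""
    ranked = sorted(candidates.items(),
                    key=lambda item: penalized_fitness(item[1]),
                    reverse=True)
    for name, stats in ranked:
        if human_approves(name, stats):
            return name
    return None
```

In this sketch an agent with a higher raw reward can still rank below an honest one once penalties are applied, and the human_approves callback (here any function taking a name and its stats) has the final say over which candidate is selected.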