Multi-agent collusion propensity:
Multiple agents tend to coordinate their actions through covert means to maximize common interests, possibly harming third-party interests or evading regulation. Even if individual agents are designed with safety constraints, their collusive behavior may still trigger systemic risks, such as market manipulation or cascading failures, that are difficult to detect and mitigate. Colluding agents may also develop specialized communication protocols to avoid monitoring.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit1477
Domain lineage
7. AI System Safety, Failures, & Limitations
7.6 > Multi-agent risks
Mitigation strategy
1. Establish Comprehensive Monitoring and Detection Regimes: Implement advanced, real-time monitoring of all inter-agent communication channels and behavioral patterns, including analysis for subtle or steganographic communication protocols. This involves deriving and tracking actionable metrics for anomalous activity rates and coordinated behaviors, enabling the proactive identification of tacit or explicit collusion that bypasses individual agent safety constraints.
2. Impose Structured and Secured Communication Constraints: Restrict inter-agent communication to necessary, pre-approved channels and formats, securing these conduits with robust encryption, authentication, and comprehensive logging. This mitigation strategy fundamentally limits the structural opportunities for agents to establish the covert communication protocols necessary for collusive coordination.
3. Implement Architectural and Environmental Interventions: Apply layered system architectures to isolate critical components, which enables targeted analysis and mitigation of localized issues. Furthermore, employ environmental interventions, such as modifying the system's reward or observation functions (as per the Partially-Observable Stochastic Game framework), to structurally disincentivize agent alignment that results in mutual benefit at the expense of third-party victims.
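As an illustration of the "actionable metrics for coordinated behaviors" mentioned in the first mitigation, the following is a minimal sketch and not a method prescribed by this entry. It assumes each agent's actions are logged as an equal-length sequence of discrete labels, and it compares how often two agents choose the same action against the rate expected if they acted independently. The function names, the log layout, and the 0.3 threshold are all hypothetical. A high score is only a signal for review: agents reacting to the same public information can agree without colluding.

```python
from collections import Counter
from itertools import combinations


def agreement_excess(actions_a, actions_b):
    """Observed agreement rate minus the rate expected under independence.

    Hypothetical metric for illustration: values near 0 are consistent with
    independent behavior, values well above 0 suggest coordinated behavior.
    """
    if len(actions_a) != len(actions_b) or not actions_a:
        raise ValueError("action sequences must be non-empty and equal in length")
    n = len(actions_a)
    # Fraction of time steps at which both agents took the same action.
    observed = sum(a == b for a, b in zip(actions_a, actions_b)) / n
    freq_a = Counter(actions_a)
    freq_b = Counter(actions_b)
    # Chance agreement if each agent drew actions independently from its
    # own empirical action distribution.
    expected = sum((freq_a[x] / n) * (freq_b[x] / n) for x in freq_a)
    return observed - expected


def flag_coordinated_pairs(action_log, threshold=0.3):
    """Return (agent, agent, excess) for pairs whose excess exceeds threshold.

    `action_log` maps an agent name to its sequence of logged actions.
    The threshold is an assumed value and would need calibration in practice.
    """
    flagged = []
    for a, b in combinations(sorted(action_log), 2):
        excess = agreement_excess(action_log[a], action_log[b])
        if excess > threshold:
            flagged.append((a, b, round(excess, 3)))
    return flagged


# Toy log: agent_1 and agent_2 move in lockstep, agent_3 does not.
log = {
    "agent_1": ["raise", "raise", "hold", "raise", "hold", "hold"],
    "agent_2": ["raise", "raise", "hold", "raise", "hold", "hold"],
    "agent_3": ["hold", "raise", "raise", "hold", "hold", "raise"],
}
print(flag_coordinated_pairs(log))  # [('agent_1', 'agent_2', 0.5)]
```

A pairwise metric like this covers only overt behavioral correlation. It would not by itself reveal steganographic communication, which the first mitigation treats as a separate analysis target.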