Distributional Shift
Individual ML systems can perform poorly in contexts different from those in which they were trained. A key source of these distributional shifts is the actions and adaptations of other agents (Narang et al., 2023; Papoudakis et al., 2019; Piliouras & Yu, 2022), which single-agent approaches often simply ignore or, at best, model exogenously. Indeed, the sheer number and variance of behaviours that can be exhibited by other agents means that multi-agent systems pose an especially challenging generalisation problem for individual learners (Agapiou et al., 2022; Leibo et al., 2021; Stone et al., 2010). While distributional shifts can cause issues in common-interest settings (see Section 2.1), they are more worrisome in mixed-motive settings, since agents' ability to cooperate depends not only on coordinating on one of many arbitrary conventions (which might be easily resolved by a common language), but also on their beliefs about what solutions other agents will find acceptable.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit1234
Domain lineage
7. AI System Safety, Failures, & Limitations
7.6 > Multi-agent risks
Mitigation strategy
1. Establish Formal Inter-Agent Coordination and Trust Protocols: Design and enforce explicit, secure communication and coordination protocols between agents. This aims to stabilize the system's dynamics by enhancing the transparency and predictability of other agents' actions and beliefs, which is crucial for cooperation and reducing the magnitude of behavioral-induced distributional shifts, particularly in mixed-motive settings.
2. Develop Robust Generalization and Adaptation Strategies: Implement advanced training techniques to improve individual agents' generalization capabilities against unseen or shifting distributions. This includes leveraging self-supervised pretraining to extract shift-invariant features, and employing test-time refinement methods, such as utilizing cheap priors or auxiliary objectives, to adapt model representations to out-of-distribution inputs at minimal computational cost.
3. Institute Continuous Monitoring and Out-of-Distribution (OOD) Detection: Deploy real-time diagnostics to detect when the multi-agent system is operating in a novel, out-of-distribution regime (e.g., monitoring shifts in input feature distributions or predicted force norms). A robust detection system allows for the immediate triggering of a pre-defined remediation strategy, such as agent recalibration, model updating with recent data, or escalation to human oversight.
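The OOD monitoring described in the third mitigation can be sketched concretely. The example below is an illustrative minimal implementation, not a prescribed method: it compares per-feature batch means against reference statistics collected in the training regime and flags batches whose drift exceeds a z-score threshold. The class name `DriftMonitor` and the threshold value are assumptions for illustration; a production system would likely use richer statistics (e.g., two-sample tests or density estimates).

```python
import numpy as np

class DriftMonitor:
    """Illustrative sketch: flag out-of-distribution batches by comparing
    per-feature batch means against reference (training-regime) statistics.
    Names and the threshold are hypothetical choices, not a standard API."""

    def __init__(self, reference: np.ndarray, z_threshold: float = 4.0):
        # reference: (n_samples, n_features) array drawn from the training regime.
        self.mu = reference.mean(axis=0)
        self.sigma = reference.std(axis=0) + 1e-8  # avoid division by zero
        self.z_threshold = z_threshold

    def check(self, batch: np.ndarray) -> bool:
        """Return True if the batch's feature means drift beyond the threshold."""
        # Standard error of the batch mean under the reference distribution.
        se = self.sigma / np.sqrt(batch.shape[0])
        z = np.abs(batch.mean(axis=0) - self.mu) / se
        # Escalate (recalibrate, update, or hand off to human oversight)
        # if any feature's batch mean is an outlier.
        return bool((z > self.z_threshold).any())
```

A detector like this would run on each incoming batch of observations; a `True` result triggers the pre-defined remediation strategy (recalibration, retraining on recent data, or escalation to human oversight) named in the mitigation above.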