7. AI System Safety, Failures, & Limitations

Miscoordination

Miscoordination arises when agents, despite a mutual and clear objective, cannot align their behaviours to achieve this objective. Unlike the case of differing objectives, common-interest settings admit a well-defined notion of ‘optimal’ behaviour, and we describe agents as miscoordinating to the extent that they fall short of this optimum. Note that for common-interest settings it is not sufficient for agents’ objectives to be the same in the sense of being symmetric (e.g., when two agents both want the same prize, but only one can win). Rather, agents must have identical preferences over outcomes (e.g., when two agents are on the same team and win a prize as a team or not at all).
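The distinction above can be made concrete with a toy coordination game. The following sketch is a hypothetical illustration (not from the repository entry): both agents receive the same payoff, so their preferences over outcomes are identical, yet independently chosen conventions can still miscoordinate.

```python
# Toy common-interest coordination game: both agents share ONE payoff,
# so preferences over outcomes are identical -- the common-interest case.
ACTIONS = ["left", "right"]

def shared_payoff(a1: str, a2: str) -> int:
    """Both agents earn 1 if they pick the same convention, 0 otherwise."""
    return 1 if a1 == a2 else 0

# Two equally good optima exist: ("left", "left") and ("right", "right").
# If each agent independently learned a different convention, the pair
# falls short of the optimum -- this shortfall is miscoordination.
agent_1_choice = "left"   # agent 1's learned convention (assumed for illustration)
agent_2_choice = "right"  # agent 2's learned convention (assumed for illustration)

payoff = shared_payoff(agent_1_choice, agent_2_choice)
print(payoff)  # 0: identical preferences, yet the agents miscoordinate
```

By contrast, a symmetric-but-competitive game (both agents want the same single prize) would give each agent its own payoff function, and falling short of a joint optimum would no longer be cleanly defined.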

Source: MIT AI Risk Repository (mit1206)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit1206

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.6 > Multi-agent risks

Mitigation strategy

1. Prioritize sequential decision-making protocols. Implement multi-agent reinforcement learning architectures that enforce an explicit, sequential action selection mechanism, where agents condition their policy on the real-time actions and observations of peer agents to stabilize learning and enforce optimal collective behavior.

2. Develop and integrate an explicit Theory of Mind (ToM) module. Implement inter-agent modeling capabilities that allow agents to generate accurate, dynamic representations of their partners' beliefs, goals, and knowledge, thereby reducing coordination failures stemming from informational asymmetry or misinterpretation.

3. Mandate rigorous simulation-based stress testing. Conduct comprehensive pre-deployment and continuous testing that explicitly simulates coordination breakdowns and conflicting inputs. Utilize failure cascade modeling to identify and isolate weak points in inter-agent communication and system resilience.
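The first mitigation can be sketched with the same kind of toy common-interest game used to define miscoordination. This is a minimal hypothetical illustration of sequential action selection, not an implementation from the repository: the follower conditions its choice on the leader's observed action, so the pair always lands on one of the joint optima.

```python
# Hypothetical sketch of mitigation 1: sequential action selection.
# The follower observes the leader's action BEFORE acting, so any
# convention the leader adopts is matched and coordination is guaranteed.
ACTIONS = ["left", "right"]

def shared_payoff(a1: str, a2: str) -> int:
    """Common-interest payoff: 1 on a match, 0 on a mismatch."""
    return 1 if a1 == a2 else 0

def leader_policy() -> str:
    # The leader may follow any convention; "left" is assumed for illustration.
    return "left"

def follower_policy(observed_leader_action: str) -> str:
    # Conditioning on the peer's real-time action removes the
    # independent-convention failure mode entirely.
    return observed_leader_action

a1 = leader_policy()
a2 = follower_policy(a1)
print(shared_payoff(a1, a2))  # 1: the protocol rules out miscoordination
```

In a simultaneous-move version of this game, neither policy could condition on the other's action, and mismatched conventions would again be possible; the sequential protocol trades some parallelism for that stability.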