
Incompatible strategies

Incompatible Strategies. Even if all agents can perform well in isolation, miscoordination can still occur due to the agents choosing incompatible strategies (Cooper et al., 1990). Competitive (i.e., two-player zero-sum) settings allow designers to produce agents that are maximally capable without taking other players into account. Crucially, this is possible because playing a strategy at equilibrium in the zero-sum setting guarantees a certain payoff, even if other players deviate from the equilibrium (Nash, 1951). On the other hand, common-interest (and mixed-motive) settings often allow a vast number of mutually incompatible solutions (Schelling, 1980), which is worsened in partially observable environments (Bernstein et al., 2002; Reif, 1984).
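The failure mode above can be made concrete with a minimal, hypothetical common-interest coordination game: both joint actions (A, A) and (B, B) are equilibria, so each agent's strategy can be optimal in isolation while the pair still miscoordinates. The game and its payoffs here are illustrative, not drawn from the cited papers.

```python
# Hypothetical 2x2 common-interest coordination game ("choose a side").
# The payoff is shared: 1 if both agents pick the same action, 0 otherwise.
# Both (A, A) and (B, B) are Nash equilibria, so two agents designed
# independently can each play an equilibrium strategy yet still fail together.

PAYOFF = {
    ("A", "A"): 1,
    ("B", "B"): 1,
    ("A", "B"): 0,
    ("B", "A"): 0,
}

def joint_payoff(strategy_1: str, strategy_2: str) -> int:
    """Shared payoff both agents receive for a joint action."""
    return PAYOFF[(strategy_1, strategy_2)]

# Each strategy is optimal when paired with a like-minded partner...
assert joint_payoff("A", "A") == 1
assert joint_payoff("B", "B") == 1
# ...but the two strategies are mutually incompatible.
assert joint_payoff("A", "B") == 0
```

Contrast this with the zero-sum case: there, an equilibrium strategy's payoff guarantee holds regardless of the opponent, so no such selection problem arises.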

Source: MIT AI Risk Repository (mit1207)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit1207

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.6 > Multi-agent risks

Mitigation strategy

1. Prioritize design-time **Schelling points (focal points)** in the agents' decision-making processes, so that agents converge on a mutually obvious equilibrium in common-interest and coordination problems even without pre-communication.
2. Enforce **objective alignment and incentive structures** by using peer incentivisation and engineering agents with identical or tightly constrained preferences over shared system outcomes, preventing strategically incompatible objectives from emerging.
3. Implement **robust communication and state-reconciliation protocols** that maintain a consistent shared context and belief-reconciliation mechanisms across all agents, mitigating the miscoordination risk exacerbated by partial observability and distributed state management.
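The first mitigation can be sketched as a shared tie-breaking convention: when several equilibria are equally good, every agent applies the same deterministic rule (here, lexicographic order, an illustrative choice) to select one, so independently designed agents converge without communicating.

```python
# Hypothetical design-time focal point: among equally good equilibria,
# every agent deterministically picks the lexicographically smallest
# joint action. Because the rule is shared and unambiguous, agents
# that never communicate still select the same equilibrium.

def pick_focal(equilibria: list[tuple[str, str]]) -> tuple[str, str]:
    """Return the agreed-upon focal equilibrium (lexicographic minimum)."""
    return min(equilibria)

equilibria = [("B", "B"), ("A", "A")]

# Both agents apply the rule independently and reach the same choice.
assert pick_focal(equilibria) == ("A", "A")
assert pick_focal(list(reversed(equilibria))) == ("A", "A")
```

Any shared, order-independent rule works; the key design property is that all agents commit to the same rule before deployment.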