
Social Dilemmas

As noted in our definition, conflict can arise in any situation in which selfish incentives diverge from the collective good, known as a social dilemma (Dawes & Messick, 2000; Hardin, 1968; Kollock, 1998; Ostrom, 1990). While this is by no means a modern problem, advances in AI could further enable actors to pursue their selfish incentives by overcoming the technical, legal, or social barriers that ordinarily help to prevent this. To take a plausible, near-term (if very low-stakes) example, an automated AI assistant could reserve a table at every restaurant in town within minutes, letting the user decide later and cancel all the other reservations.
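
To make the payoff structure concrete, here is a minimal sketch in Python using illustrative Prisoner's Dilemma payoffs (the specific numbers are assumptions for this sketch, not from the repository): each agent's selfish best response is to defect, yet mutual cooperation yields the highest collective welfare.

```python
# Illustrative two-player social dilemma (Prisoner's Dilemma payoffs,
# chosen for this sketch). Actions: 0 = cooperate, 1 = defect.
# payoff[(a, b)] = (reward to player A, reward to player B).
payoff = {
    (0, 0): (3, 3),  # mutual cooperation: best collective outcome
    (0, 1): (0, 5),  # A cooperates, B defects: B gains at A's expense
    (1, 0): (5, 0),
    (1, 1): (1, 1),  # mutual defection: worst collective outcome
}

def best_response(opponent_action: int) -> int:
    """Action maximizing player A's own payoff, ignoring player B."""
    return max((0, 1), key=lambda a: payoff[(a, opponent_action)][0])

for b in (0, 1):
    # Prints 1 (defect) in both cases: defection dominates for a selfish agent.
    print(f"opponent plays {b}: selfish best response = {best_response(b)}")

collective = {acts: sum(rewards) for acts, rewards in payoff.items()}
# (0, 0): mutual cooperation maximizes total welfare (6 vs. 2 for mutual defection).
print(max(collective, key=collective.get))
```

The divergence between the dominant individual strategy (defect) and the welfare-maximizing joint outcome (cooperate, cooperate) is exactly the incentive gap the description above warns AI could let actors exploit at scale.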

Source: MIT AI Risk Repository (mit1211)

ENTITY: 2 - AI
INTENT: 1 - Intentional
TIMING: 2 - Post-deployment
Risk ID: mit1211

Domain lineage: 7. AI System Safety, Failures, & Limitations (375 mapped risks) > 7.6 Multi-agent risks

Mitigation strategy

1. Implement formal contracting and incentive-modification mechanisms that enable autonomous agents to negotiate binding, zero-sum modifications to their objective functions, aligning individual rationality with social welfare in shared-resource environments.
2. Deploy rigorous, policy-based trust and access controls within multi-agent systems to manage and verify the legitimacy of autonomous agents, with real-time enforcement to prevent the exploitation of technical or social barriers for self-interested resource depletion.
3. Integrate explicit social preferences, or "other-regarding" utility functions, into agent design so that agents intrinsically value collective well-being, directly addressing the divergence between self-interest and the group's benefit in strategic interactions (a minimal sketch of this idea follows the list).
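
As a concrete illustration of strategy 3, here is a minimal sketch assuming a simple linear blend of an agent's own reward with the mean reward of other agents, weighted by a hypothetical parameter alpha (the function name, parameter, and functional form are assumptions for illustration, not a design prescribed by the repository). With enough weight on others' outcomes, the dilemma sketched earlier dissolves and cooperation becomes the preferred choice.

```python
from statistics import mean

def other_regarding_utility(own_reward: float,
                            others_rewards: list[float],
                            alpha: float = 0.5) -> float:
    """Blend own reward with the mean reward of the other agents.

    alpha = 0 recovers a purely selfish agent; alpha = 1 a purely
    prosocial one. (alpha and this linear form are illustrative
    assumptions, not a prescribed design.)
    """
    return (1 - alpha) * own_reward + alpha * mean(others_rewards)

# Re-scoring the earlier dilemma's outcomes with alpha = 0.5:
print(other_regarding_utility(5, [0]))  # defect against a cooperator: 2.5
print(other_regarding_utility(3, [3]))  # mutual cooperation: 3.0 > 2.5
# A sufficiently other-regarding agent now prefers to cooperate.
```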