Performative utterances
The chatbot's output constitutes a deal, commitment, or other consequential action that the deployer did not intend.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit1401
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
1. Implement a **Constrained Decoding and Output Sanitization** framework that programmatically blocks generated utterances matching pre-defined performative verbs or unauthorized actions (e.g., 'commit,' 'agree,' 'guarantee').
2. Establish a **Formal AI Governance Framework** that explicitly defines the system's operational boundaries and its lack of authority to enter contracts or make binding commitments, and ensure this policy is reflected in the system prompt and in user-facing disclaimers.
3. Mandate a **Human-in-the-Loop Vetting Process** for all mission-critical or consequential AI-generated outputs, requiring mandatory human fact-checking and ratification before any potentially performative statement is acted upon or disseminated.
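The output-sanitization step above can be sketched as a simple post-generation filter. This is a minimal illustration, not a production implementation: the pattern list, placeholder text, and function names are hypothetical, and a real deployment would pair such regex screening with a trained classifier and route flagged drafts to the human-in-the-loop vetting process.

```python
import re

# Hypothetical blocklist of performative constructions. A real system
# would tune this list per deployment and combine it with a classifier.
PERFORMATIVE_PATTERNS = [
    r"\bI\s+(?:hereby\s+)?(?:commit|agree|guarantee|promise|authorize)\b",
    r"\bwe\s+(?:commit|agree|guarantee|promise)\b",
]

def sanitize_output(text: str) -> tuple[str, bool]:
    """Redact performative phrasing and report whether the draft
    needs human review before being sent to the user."""
    flagged = False
    for pattern in PERFORMATIVE_PATTERNS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            flagged = True
            # Replace the performative phrase with a review placeholder.
            text = re.sub(pattern, "[REQUIRES HUMAN APPROVAL]",
                          text, flags=re.IGNORECASE)
    return text, flagged

# Example: a chatbot draft containing an unintended commitment.
draft = "Thanks for your patience. I guarantee a full refund by Friday."
clean, needs_review = sanitize_output(draft)
```

Flagged drafts would then be held for ratification rather than delivered directly, implementing the vetting requirement in item 3.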