Critical infrastructure component failures when integrated with AI systems
When critical infrastructure relies on GPAI, common-mode failures may originate in vulnerabilities or robustness weaknesses in the underlying model architecture or training setup. These failures can occur accidentally (in edge cases) or be triggered by adversarial inputs to the AI systems [58].
ENTITY
2 - AI
INTENT
3 - Other
TIMING
2 - Post-deployment
Risk ID
mit1171
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
1. Mandate comprehensive adversarial testing (red-teaming) and continuous model evaluation against known attack vectors, ensuring the General-Purpose AI (GPAI) architecture and its integrations remain resilient to vulnerabilities, sophisticated adversarial inputs, and potential common-mode failures throughout the system lifecycle.
2. Implement a multi-layered security architecture that includes stringent input validation, sanitization, and continuous, real-time behavioral monitoring (e.g., statistical anomaly detection) to detect and filter malicious or anomalous inputs before they propagate to critical physical components.
3. Establish robust human-AI joint decision-making protocols and system resilience measures, preserving human agency and oversight through explainable (transparent) AI agents, and provide automated fail-safes, rollback options, and incident escalation paths to manage post-deployment failures.
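The statistical anomaly detection mentioned in item 2 can be sketched as an input gate that flags values deviating sharply from recent history before they reach a downstream critical component. This is a minimal illustrative sketch, not a prescribed implementation: the class name, window size, and z-score threshold are all hypothetical choices, and a production system would use richer, multivariate detectors.

```python
from collections import deque
import math


class ZScoreInputGate:
    """Hypothetical input gate: rejects numeric inputs whose z-score
    against a rolling window of recently accepted values exceeds a
    threshold. Parameters are illustrative, not from the source."""

    def __init__(self, window_size=100, threshold=4.0, min_samples=10):
        self.window = deque(maxlen=window_size)   # rolling history
        self.threshold = threshold                # max allowed |z-score|
        self.min_samples = min_samples            # warm-up period

    def allow(self, value):
        """Return True if the value looks in-distribution."""
        if len(self.window) < self.min_samples:
            # Not enough history yet: pass through while learning.
            self.window.append(value)
            return True
        mean = sum(self.window) / len(self.window)
        var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
        std = math.sqrt(var)
        if std == 0:
            anomalous = value != mean
        else:
            anomalous = abs(value - mean) / std > self.threshold
        if not anomalous:
            # Only learn from accepted inputs, so a flood of outliers
            # cannot gradually shift the baseline.
            self.window.append(value)
        return not anomalous
```

Feeding the gate a stream of typical sensor readings and then an outlier shows the intended behavior: in-range values pass, the outlier is blocked, and the baseline is unaffected by the rejected value.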