7. AI System Safety, Failures, & Limitations

Performance & Robustness

The AI system's ability to fulfill its intended purpose, and its resilience to perturbations and to unusual or adverse inputs. Performance failures undermine the system's correct functioning; robustness failures can lead to severe consequences.

Source: MIT AI Risk Repository (mit164)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit164

Domain lineage

7. AI System Safety, Failures, & Limitations


7.3 > Lack of capability or robustness

Mitigation strategy

1. Implement state-of-the-art **Adversarial Training** protocols during the model development phase to proactively minimize the adversarial loss and enhance the system's resilience against intentional manipulations and exploitation of latent vulnerabilities (e.g., prompt injections or data perturbations).
2. Establish a **Rigorous and Continuous Evaluation Framework** that incorporates stress testing, white-box/black-box testing, and human red-teaming to systematically quantify the system's robustness against unexpected inputs, out-of-distribution data, and worst-case attack vectors across its lifecycle.
3. Employ **Data Augmentation and Regularization** techniques (e.g., noise injection, varied transformations, L2 regularization) to expand the model's training distribution, mitigate overfitting, and thereby improve its generalization capability and stability when faced with noisy or unusual operational inputs.
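To make the first mitigation concrete, here is a minimal pure-Python sketch of adversarial training for a toy logistic-regression classifier, using the Fast Gradient Sign Method (FGSM) to craft the worst-case perturbation at each step. This is an illustrative example, not the repository's prescribed implementation; all function names (`fgsm`, `adversarial_train`, etc.) and hyperparameters are assumptions chosen for the sketch.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_wrt_input(w, b, x, y):
    # Gradient of the logistic loss with respect to the INPUT x:
    # L = -[y log p + (1-y) log(1-p)], p = sigmoid(w.x + b)  =>  dL/dx_i = (p - y) * w_i
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def fgsm(w, b, x, y, eps):
    # Fast Gradient Sign Method: nudge x by eps in the direction that INCREASES the loss.
    g = grad_wrt_input(w, b, x, y)
    return [xi + eps * (1 if gi > 0 else -1 if gi < 0 else 0)
            for xi, gi in zip(x, g)]


def adversarial_train(data, eps=0.1, lr=0.5, epochs=200):
    """SGD on the adversarial loss: each example is replaced by its FGSM perturbation."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = fgsm(w, b, x, y, eps)          # inner step: craft adversarial example
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)) + b)
            w = [wi - lr * (p - y) * xi            # outer step: descend on adversarial loss
                 for wi, xi in zip(w, x_adv)]
            b -= lr * (p - y)
    return w, b
```

The key design point is the inner/outer split: the inner step maximizes the loss over an eps-ball around each input, and the outer step minimizes that worst-case loss, so the learned boundary keeps a margin against small perturbations.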
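The third mitigation combines two independent levers, and a minimal sketch of each is shown below: noise-injection augmentation (expanding the training distribution with jittered copies of each example) and an SGD update with an L2 penalty (weight decay). Both helpers are assumptions made for illustration, not part of any specific library.

```python
import random


def augment_with_noise(dataset, copies=3, sigma=0.05, seed=0):
    """Noise-injection augmentation: append `copies` jittered variants of each example,
    keeping the original label, so the model sees a wider input distribution."""
    rng = random.Random(seed)
    augmented = list(dataset)
    for x, y in dataset:
        for _ in range(copies):
            augmented.append(([xi + rng.gauss(0.0, sigma) for xi in x], y))
    return augmented


def sgd_step_l2(w, grad, lr=0.1, weight_decay=0.01):
    """One SGD update with an L2 penalty: the effective gradient is
    grad + weight_decay * w, which shrinks weights toward zero and curbs overfitting."""
    return [wi - lr * (gi + weight_decay * wi) for wi, gi in zip(w, grad)]
```

Note the division of labor: augmentation regularizes through the data (the model must fit many perturbed views of each point), while weight decay regularizes through the parameters (large weights are penalized even when they fit the training set), and the two are routinely combined.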