2. Privacy & Security

Security

Artificial intelligence comes with an intrinsic set of challenges that must be considered when discussing trustworthiness, especially in the context of functional safety. AI models, particularly those of greater complexity (such as neural networks), can exhibit specific weaknesses not found in other types of systems and must therefore be subjected to higher levels of scrutiny, especially when deployed in a safety-critical context.

Source: MIT AI Risk Repository (mit184)

ENTITY

3 - Other

INTENT

3 - Other

TIMING

2 - Post-deployment

Risk ID

mit184

Domain lineage

2. Privacy & Security

186 mapped risks

2.2 > AI system security vulnerabilities and attacks

Mitigation strategy

- Implement adversarial training and robustness testing protocols: expose the model to crafted adversarial examples during the training phase to improve generalization and prevent exploitation by known and novel evasion attack methodologies.
- Establish robust input validation and preprocessing pipelines as a critical boundary layer, using statistical anomaly detection and feature sanitization techniques (e.g., feature squeezing) to detect and neutralize subtle, malicious perturbations in input data before they reach the core AI inference engine.
- Deploy continuous real-time model behavioral monitoring to detect deviations from established performance baselines and identify anomalous query patterns, enabling rapid incident response to ongoing attacks such as model extraction attempts or integrity compromises in a post-deployment environment.
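The input-sanitization bullet above can be illustrated with a minimal sketch of feature squeezing: reduce the bit depth of an input and flag it as suspicious if the model's prediction shifts sharply between the raw and squeezed versions. All names here (`bit_depth_squeeze`, `flag_adversarial`, the threshold value) are illustrative assumptions, not part of the repository entry.

```python
import numpy as np

def bit_depth_squeeze(x, bits=4):
    """Reduce the bit depth of inputs in [0, 1] (a common feature-squeezing filter)."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def flag_adversarial(predict, x, threshold=0.5):
    """Flag an input whose prediction shifts sharply after squeezing.

    `predict` returns a probability vector; a large L1 distance between
    predictions on the raw and squeezed input suggests the input carries
    a fine-grained adversarial perturbation that squeezing destroyed.
    """
    p_raw = predict(x)
    p_squeezed = predict(bit_depth_squeeze(x))
    return bool(np.abs(p_raw - p_squeezed).sum() > threshold)
```

In a deployed pipeline this check would sit in front of the inference engine, with the threshold calibrated on benign traffic so that clean inputs are rarely flagged.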

ADDITIONAL EVIDENCE

One class of attacks on AI systems has recently garnered particular interest: adversarial machine learning. Here, an attacker tries to manipulate an AI model to cause it to malfunction, change the expected model output, or obtain information about the model that would otherwise not be available to them.
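To make the evasion-attack idea concrete, the sketch below applies the Fast Gradient Sign Method (FGSM) to a toy logistic-regression model: stepping the input in the sign direction of the loss gradient flips the predicted class. The model weights and input values are invented for illustration and are not taken from the repository entry.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y, eps=0.25):
    """Craft an FGSM adversarial example against logistic regression.

    The gradient of the cross-entropy loss w.r.t. the input is
    (p - y) * w, so stepping eps in its sign direction maximizes the loss.
    """
    p = sigmoid(w @ x + b)
    grad = (p - y) * w
    return x + eps * np.sign(grad)

# toy model and an input the model classifies correctly as class 1
w = np.array([1.0, -1.0])
b = 0.0
x = np.array([0.6, 0.2])            # w @ x + b = 0.4 > 0 -> class 1
x_adv = fgsm_perturb(x, w, b, y=1.0)  # small perturbation flips the sign
```

The same gradient-sign step, applied during training to generate hard examples, is the core of the adversarial-training mitigation listed above.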