Evasion Attacks
Evasion attacks [145] aim to cause significant shifts in a model's predictions by adding perturbations to test samples, producing adversarial examples. Specifically, the perturbations can be crafted via word substitutions, gradient-based methods, and similar techniques.
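As an illustration of a gradient-based evasion attack, the sketch below applies the Fast Gradient Sign Method (FGSM) to a toy logistic-regression model. The weights, input, and perturbation budget are illustrative assumptions, not taken from the cited work [145].

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Fast Gradient Sign Method: move x in the direction that
    increases the loss, bounded by an L-infinity budget eps."""
    p = sigmoid(w @ x + b)            # model's predicted probability
    grad_x = (p - y) * w              # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)  # adversarial example

# Toy model that classifies the clean input correctly (illustrative values)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])   # clean input, true label 1
y = 1.0

clean_pred = sigmoid(w @ x + b)            # confident, correct prediction
x_adv = fgsm_perturb(x, y, w, b, eps=1.0)
adv_pred = sigmoid(w @ x_adv + b)          # confidence drops sharply
print(clean_pred > 0.5, adv_pred < clean_pred)
```

Even this two-parameter example shows the core mechanism: the attacker needs only the sign of the input gradient, not its magnitude, to push the prediction across the decision boundary.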
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
1 - Pre-deployment
Risk ID
mit50
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Implement Adversarial Training to substantially enhance model robustness by incorporating a diverse set of adversarial examples into the training regimen.
2. Apply Input Sanitization and Robust Feature Extraction techniques to preprocess incoming data, thereby mitigating the influence of subtle adversarial perturbations before they reach the model.
3. Employ architectural defenses such as Gradient Masking or Defensive Distillation to obfuscate the model's decision boundaries and hinder an adversary's ability to compute effective gradient-based adversarial perturbations.
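Mitigation 1 (adversarial training) can be sketched as follows: at each training step, FGSM perturbations are crafted against the current model and the model is trained on the mix of clean and perturbed inputs. The toy data, model, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linearly separable data (illustrative, not from the source)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr, eps = 0.5, 0.1

for _ in range(200):
    p = sigmoid(X @ w + b)
    # Craft FGSM examples against the current model parameters
    X_adv = X + eps * np.sign((p - y)[:, None] * w)
    # Train on clean + adversarial examples with the original labels
    X_mix = np.vstack([X, X_adv])
    y_mix = np.concatenate([y, y])
    p_mix = sigmoid(X_mix @ w + b)
    w -= lr * (X_mix.T @ (p_mix - y_mix)) / len(y_mix)
    b -= lr * np.mean(p_mix - y_mix)

# Robust accuracy: evaluate on FGSM examples crafted against the final model
p = sigmoid(X @ w + b)
X_adv = X + eps * np.sign((p - y)[:, None] * w)
robust_acc = np.mean((sigmoid(X_adv @ w + b) > 0.5) == (y == 1))
print(round(robust_acc, 2))
```

Note that points lying within the perturbation budget of the true decision boundary can still be flipped regardless of training, so adversarial training raises, but does not guarantee, robust accuracy.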