Evasion Attacks
Evasion attacks [145] aim to cause significant shifts in a model's predictions by adding perturbations to test samples, producing adversarial examples. Specifically, the perturbations can be crafted via word substitutions, gradient-based methods, and similar techniques.
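As an illustration of a gradient-based evasion attack, the sketch below applies the Fast Gradient Sign Method (FGSM) to a toy logistic-regression model. The weights, input, and perturbation budget are illustrative assumptions, not taken from the cited work [145].

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Fast Gradient Sign Method: move x in the direction that
    increases the loss, bounded by an L-infinity budget eps."""
    p = sigmoid(w @ x + b)            # model's predicted probability
    grad_x = (p - y) * w              # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)  # adversarial example

# Toy model that classifies the clean input correctly (illustrative values)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])   # clean input, true label 1
y = 1.0

clean_pred = sigmoid(w @ x + b)            # confident, correct prediction
x_adv = fgsm_perturb(x, y, w, b, eps=1.0)
adv_pred = sigmoid(w @ x_adv + b)          # confidence drops sharply
print(clean_pred > 0.5, adv_pred < clean_pred)
```

Even this two-parameter example shows the core mechanism: the attacker needs only the sign of the input gradient, not its magnitude, to push the prediction across the decision boundary.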
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
1 - Pre-deployment
Risk ID
mit50
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Implement Adversarial Training to substantially enhance model robustness by incorporating a diverse set of adversarial examples into the training regimen.
2. Apply Input Sanitization and Robust Feature Extraction techniques to preprocess incoming data, thereby mitigating the influence of subtle adversarial perturbations before they reach the model.
3. Employ architectural defenses such as Gradient Masking or Defensive Distillation to obfuscate the model's decision boundaries and hinder an adversary's ability to compute effective gradient-based adversarial perturbations.
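Mitigation 1 (adversarial training) can be sketched as follows: at each training step, FGSM perturbations are crafted against the current model and the model is trained on the mix of clean and perturbed inputs. The toy data, model, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linearly separable data (illustrative, not from the source)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr, eps = 0.5, 0.1

for _ in range(200):
    p = sigmoid(X @ w + b)
    # Craft FGSM examples against the current model parameters
    X_adv = X + eps * np.sign((p - y)[:, None] * w)
    # Train on clean + adversarial examples with the original labels
    X_mix = np.vstack([X, X_adv])
    y_mix = np.concatenate([y, y])
    p_mix = sigmoid(X_mix @ w + b)
    w -= lr * (X_mix.T @ (p_mix - y_mix)) / len(y_mix)
    b -= lr * np.mean(p_mix - y_mix)

# Robust accuracy: evaluate on FGSM examples crafted against the final model
p = sigmoid(X @ w + b)
X_adv = X + eps * np.sign((p - y)[:, None] * w)
robust_acc = np.mean((sigmoid(X_adv @ w + b) > 0.5) == (y == 1))
print(round(robust_acc, 2))
```

Note that points lying within the perturbation budget of the true decision boundary can still be flipped regardless of training, so adversarial training raises, but does not guarantee, robust accuracy.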