Evasion attack
Evasion attacks attempt to make a model produce incorrect outputs by applying small, often imperceptible perturbations to the inputs sent to the trained model.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1288
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Proactively implement adversarial training protocols, such as Projected Gradient Descent (PGD)-based methods, to enhance the model's intrinsic robustness by exposing it to adversarial examples during the training phase, thereby smoothing its decision boundaries.
2. Employ input transformation and preprocessing techniques, such as feature squeezing or denoising, to neutralize the subtle, low-magnitude perturbations characteristic of evasion attacks before they reach the inference engine.
3. Integrate an adversarial-example detection and rejection layer that monitors incoming data, leveraging methods like uncertainty estimation (e.g., predictive entropy) or feature-space anomaly detection to flag and discard inputs exhibiting anomalous characteristics.
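A minimal sketch of the first strategy, PGD-based adversarial training, using a NumPy logistic-regression model as a stand-in for the deployed classifier. The helper names (`pgd_perturb`, `adversarial_train`) and the step size, epsilon, and epoch counts are illustrative assumptions, not part of this entry.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_perturb(x, y, w, b, eps=0.3, alpha=0.05, steps=10):
    """PGD attack on a logistic-regression model: ascend the loss gradient
    w.r.t. the input, projecting back into the L-infinity ball of radius
    `eps` around the original input after each step."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(x_adv @ w + b)
        # Gradient of binary cross-entropy w.r.t. the input: (p - y) * w.
        grad = (p - y)[:, None] * w[None, :]
        x_adv = x_adv + alpha * np.sign(grad)     # gradient-ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
    return x_adv

def adversarial_train(X, y, eps=0.3, lr=0.1, epochs=200):
    """Train on a 50/50 mix of clean and freshly generated PGD examples,
    so the learned decision boundary accounts for perturbed inputs."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        X_adv = pgd_perturb(X, y, w, b, eps=eps)
        X_mix = np.vstack([X, X_adv])
        y_mix = np.concatenate([y, y])
        p = sigmoid(X_mix @ w + b)
        w -= lr * (X_mix.T @ (p - y_mix)) / len(y_mix)
        b -= lr * np.mean(p - y_mix)
    return w, b
```

Regenerating the adversarial examples with the current weights at every epoch is what smooths the decision boundary; a production setup would apply the same inner-maximization/outer-minimization loop to a deep model's gradients.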
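The second strategy, feature squeezing, can be sketched as bit-depth reduction plus a prediction-discrepancy check: if the model's output on an input and on its squeezed version diverge sharply, the input is likely adversarial. The helper names, the default `bits=4`, and the `threshold` value are illustrative assumptions.

```python
import numpy as np

def squeeze_bit_depth(x, bits=4):
    """Quantize features in [0, 1] to 2**bits discrete levels (bit-depth
    reduction), erasing low-magnitude perturbations below the step size."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def detect_by_squeezing(predict_proba, x, threshold=0.2, bits=4):
    """Flag an input as suspicious when the model's predictions on the raw
    and squeezed versions of the input diverge by more than `threshold`."""
    diff = np.abs(predict_proba(x) - predict_proba(squeeze_bit_depth(x, bits)))
    return float(diff.sum()) > threshold
```

Benign inputs are largely invariant to squeezing, whereas evasion perturbations tend to be destroyed by quantization, which shifts the model's prediction and triggers the flag.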
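The third strategy's uncertainty-estimation variant can be sketched with predictive entropy over the model's output probabilities: near-uniform predictions have high entropy and are rejected. The function names and the entropy `threshold` are illustrative assumptions.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector; high entropy
    signals an uncertain, possibly adversarially perturbed, input."""
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def flag_uncertain(probs, threshold=1.0):
    """Reject inputs whose predictive entropy exceeds `threshold`."""
    return predictive_entropy(probs) > threshold
```

A confident prediction like `[0.98, 0.01, 0.01]` passes, while a near-uniform one such as `[0.34, 0.33, 0.33]` (entropy close to the maximum, log 3 for three classes) is flagged and can be discarded or routed for review.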