Poisoning Attacks
Poisoning attacks [143] influence the behavior of a model by making small changes to its training data. Some efforts even leverage data poisoning techniques to implant hidden triggers into models during the training process (i.e., backdoor attacks). Attackers can use many kinds of triggers in text corpora, e.g., characters, words, sentences, and syntactic patterns.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
1 - Pre-deployment
Risk ID
mit47
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Implement robust, automated data validation pipelines, incorporating statistical anomaly detection (e.g., outlier detection, clustering algorithms) and redundant dataset checks to verify the integrity and authenticity of training data prior to model ingestion.
2. Employ adversarial training to expose the model to simulated poisoned examples during development, thereby enhancing its inherent resilience, and establish continuous, real-time monitoring of training metrics and output baselines to detect anomalous shifts in model precision or performance early.
3. Enforce strict, least-privilege access control policies (e.g., RBAC) and implement multi-factor authentication for all data sources and modification privileges to minimize the attack surface and potential entry points for malicious data injection.
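The statistical anomaly detection step in mitigation 1 can be sketched as a simple pre-ingestion filter. The example below is a minimal, illustrative sketch using a z-score over sample lengths; the function name, the length-based feature, and the 3-sigma threshold are assumptions for illustration, not a prescribed pipeline (production systems would combine several features and detectors).

```python
# Hypothetical pre-ingestion check: flag training samples whose length
# deviates more than `threshold` standard deviations from the corpus mean.
# Feature choice (text length) and threshold are illustrative assumptions.
from statistics import mean, stdev

def flag_outliers(samples, threshold=3.0):
    """Return indices of samples whose length z-score exceeds threshold."""
    lengths = [len(s) for s in samples]
    mu, sigma = mean(lengths), stdev(lengths)
    if sigma == 0:  # all samples identical in length; nothing to flag
        return []
    return [i for i, n in enumerate(lengths)
            if abs(n - mu) / sigma > threshold]

# Usage: a corpus of ordinary samples plus one injected, oversized sample.
corpus = ["short text"] * 50 + ["x" * 5000]
print(flag_outliers(corpus))  # the injected sample's index is flagged
```

Flagged samples would then be quarantined for manual review rather than silently dropped, so a poisoning attempt leaves an auditable trace.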