Steganography
Steganography is the practice of hiding coded messages in GenAI model outputs, which may allow malicious actors to communicate covertly.8
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1267
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. **Implement Advanced Real-Time Steganalysis and Output Transformation** Deploy deep learning-based steganalysis algorithms within a real-time monitoring and output transformation layer (AI Firewall concept). This system must be designed to rapidly detect covert steganographic payloads and neutralize them via techniques such as destructive filtering or media transcoding, which compromise the hidden data without affecting the functional quality of the benign content stream.
2. **Establish Multi-Layered Model-Level Mitigation and Provenance** Apply a multi-layered defense framework that includes architectural controls such as watermarking during the generation process to embed cryptographic provenance, thereby enabling the attribution of malicious outputs. Additionally, integrate techniques like controlled decoding or parameter optimization to systematically limit the generative model's capacity for secret collusion and covert communication.
3. **Enforce Supply-Chain Controls and Secure Architectural Design** Mandate rigorous controls during the GenAI development pipeline, specifically through the filtering of pre-training data to remove potential steganography-enabling artifacts and by implementing robust access controls to prevent models from utilizing external tools or random oracles that can be exploited for generating steganographic outputs.
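
The output-transformation idea in the first mitigation strategy can be illustrated for plain-text outputs with a minimal sketch. It is not the deep learning-based steganalysis the strategy calls for; it only shows the simpler "destructive filtering" step, assuming the covert channel relies on invisible Unicode characters, compatibility variants, or whitespace patterns. The names `ZERO_WIDTH`, `sanitize_output`, and `hidden_char_count` are hypothetical and chosen for illustration.

```python
import unicodedata

# Assumed, non-exhaustive set of invisible code points often used as carriers.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def hidden_char_count(text: str) -> int:
    """Count invisible format characters, a crude signal of a possible payload."""
    return sum(
        1 for ch in text
        if ch in ZERO_WIDTH or unicodedata.category(ch) == "Cf"
    )


def sanitize_output(text: str) -> str:
    """Destructively filter text so common invisible channels do not survive."""
    # NFKC folds compatibility variants (e.g. fullwidth letters) to one form.
    normalized = unicodedata.normalize("NFKC", text)
    # Drop zero-width and other format-category (Cf) characters.
    visible = "".join(
        ch for ch in normalized
        if ch not in ZERO_WIDTH and unicodedata.category(ch) != "Cf"
    )
    # Collapse whitespace runs on each line and trim both ends of the line.
    lines = [" ".join(line.split()) for line in visible.split("\n")]
    return "\n".join(lines)


if __name__ == "__main__":
    covert = "Hello\u200b \u200cworld\u200d  \t again  "
    print(hidden_char_count(covert))
    print(repr(sanitize_output(covert)))
```

A filter like this trades fidelity for safety: removing format characters can break legitimate emoji sequences and bidirectional text, and collapsing whitespace discards indentation. It also does nothing against payloads encoded in word choice, which is why the strategies above pair output transformation with steganalysis and model-level controls.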