Harassment, Impersonation, and Extortion
Deepfakes and other AI-generated content can be used to facilitate or exacerbate many of the harms listed throughout this report, but this section focuses on one subset: intentional, targeted abuse of individuals.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit516
Domain lineage
4. Malicious Actors & Misuse
4.3 > Fraud, scams, and targeted manipulation
Mitigation strategy
1. Mandate content provenance and digital watermarking standards, such as those promoted by the Coalition for Content Provenance and Authenticity (C2PA), in all generative AI systems, embedding robust, imperceptible metadata at creation time to support source tracing and forensic authentication.
2. Establish multi-layered semantic guardrails that filter both user input and model output on generative AI platforms, proactively detecting and blocking violative content, including non-consensual intimate imagery and material designed for targeted harassment.
3. Develop and deploy active authentication and liveness-detection technologies, rigorously tested against simulated deepfake attacks, to mitigate identity impersonation and fraud during critical processes such as user onboarding and biometric verification.
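Mitigation 2 describes layered filtering on both sides of the model. A minimal sketch in Python, assuming simple regex deny-lists stand in for the trained semantic classifiers a real platform would use; every name, pattern, and helper here is hypothetical, not part of any actual moderation API:

```python
import re
from dataclasses import dataclass

# Illustrative deny-lists only; a production guardrail would use trained
# classifiers and policy models, not keyword matching.
BLOCKED_REQUEST_PATTERNS = [
    r"\bundress\b",
    r"\bfake nude\b",
    r"\bhumiliat\w*\b",
]
BLOCKED_OUTPUT_PATTERNS = [
    r"\bintimate imagery\b",
]

@dataclass
class ModerationResult:
    allowed: bool
    reason: str = ""

def check_request(prompt: str) -> ModerationResult:
    """Input-side guardrail: reject prompts that match abuse patterns."""
    for pat in BLOCKED_REQUEST_PATTERNS:
        if re.search(pat, prompt, re.IGNORECASE):
            return ModerationResult(False, f"blocked request pattern: {pat}")
    return ModerationResult(True)

def check_output(text: str) -> ModerationResult:
    """Output-side guardrail: screen generated content before release."""
    for pat in BLOCKED_OUTPUT_PATTERNS:
        if re.search(pat, text, re.IGNORECASE):
            return ModerationResult(False, f"blocked output pattern: {pat}")
    return ModerationResult(True)

def guarded_generate(prompt: str, generate) -> str:
    """Run generation only if both guardrail layers pass."""
    pre = check_request(prompt)
    if not pre.allowed:
        return f"REFUSED ({pre.reason})"
    output = generate(prompt)
    post = check_output(output)
    if not post.allowed:
        return f"WITHHELD ({post.reason})"
    return output
```

The two-layer design matters: input filtering stops obviously abusive requests cheaply, while output filtering catches violative content produced from prompts that evaded the first layer.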