Malicious intent
A frequent malicious use of generative AI to harm, humiliate, or sexualize another person is the creation of nonconsensual sexual deepfake images or videos.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit517
Domain lineage
4. Malicious Actors & Misuse
4.3 > Fraud, scams, and targeted manipulation
Mitigation strategy
- Implement robust technical safeguards, including strict semantic **guardrails** on both input prompts and output content, to preemptively block the generation of Non-Consensual Intimate Imagery (NCII) and Child Sexual Abuse Material (CSAM). Furthermore, mandate the integration of content provenance standards, such as **C2PA**, to embed verifiable metadata and watermarks, establishing the authenticity and origin of all generated media and facilitating traceability.
- Establish and enforce a rapid **notice-and-removal process** across all hosting platforms, requiring takedown of verified NCII deepfakes within a strict timeline (e.g., 48 hours of notification). This mechanism should be underpinned by proactive technological solutions, such as **image hashing** (e.g., StopNCII.org), to efficiently detect and prevent the re-sharing of known nonconsensual content.
- Institute stringent **ethical and governance frameworks** for generative AI developers, prioritizing **consent** from individuals whose likeness is used. This includes maintaining comprehensive **transparency** regarding deepfake generation methodologies and establishing clear lines of **accountability**, including human oversight, to mitigate the misuse of AI systems for harm, harassment, or extortion.
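The image-hashing mitigation above can be illustrated with a minimal sketch of perceptual "average hashing," the family of techniques services such as StopNCII.org build on to match known images without storing the images themselves. The grid size, hash function, and sample images here are illustrative assumptions, not the production algorithm used by any real service.

```python
def average_hash(pixels):
    """Hash an 8x8 grid of grayscale values (0-255) into a 64-bit fingerprint.

    Each bit records whether a cell is brighter than the grid's mean, so the
    hash tolerates small edits such as re-compression or brightness changes.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming_distance(a, b):
    """Count differing bits; a small distance indicates a near-duplicate."""
    return bin(a ^ b).count("1")

# Illustrative data: an "original" image and a slightly brightened re-upload.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
reupload = [[min(255, p + 10) for p in row] for row in original]

h1, h2 = average_hash(original), average_hash(reupload)
print(hamming_distance(h1, h2))  # → 0 (edited copy still matches)
```

In a deployed takedown pipeline, only the compact fingerprints of verified NCII would be shared with platforms, which compare each upload's hash against the blocklist and flag matches below a distance threshold.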