Harassment, Impersonation, and Extortion
Deepfakes and other AI-generated content can be used to facilitate or exacerbate many of the harms listed throughout this report, but this section focuses on one subset: intentional, targeted abuse of individuals.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit516
Domain lineage
4. Malicious Actors & Misuse
4.3 > Fraud, scams, and targeted manipulation
Mitigation strategy
1. Mandate content provenance and digital watermarking standards, such as those promoted by the Coalition for Content Provenance and Authenticity (C2PA), in all generative AI systems, embedding robust, imperceptible metadata at creation time to support source tracing and forensic authentication.
2. Establish multi-layered semantic guardrails that filter both user input and model output on generative AI platforms, proactively detecting and blocking violative content, including non-consensual intimate imagery and material designed for targeted harassment.
3. Develop and deploy active authentication and liveness-detection technologies, rigorously tested against simulated deepfake attacks, to mitigate identity impersonation and fraud during critical processes such as user onboarding and biometric verification.
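Mitigation 2 describes layered filtering on both sides of the model. A minimal sketch in Python, assuming simple regex deny-lists stand in for the trained semantic classifiers a real platform would use; every name, pattern, and helper here is hypothetical, not part of any actual moderation API:

```python
import re
from dataclasses import dataclass

# Illustrative deny-lists only; a production guardrail would use trained
# classifiers and policy models, not keyword matching.
BLOCKED_REQUEST_PATTERNS = [
    r"\bundress\b",
    r"\bfake nude\b",
    r"\bhumiliat\w*\b",
]
BLOCKED_OUTPUT_PATTERNS = [
    r"\bintimate imagery\b",
]

@dataclass
class ModerationResult:
    allowed: bool
    reason: str = ""

def check_request(prompt: str) -> ModerationResult:
    """Input-side guardrail: reject prompts that match abuse patterns."""
    for pat in BLOCKED_REQUEST_PATTERNS:
        if re.search(pat, prompt, re.IGNORECASE):
            return ModerationResult(False, f"blocked request pattern: {pat}")
    return ModerationResult(True)

def check_output(text: str) -> ModerationResult:
    """Output-side guardrail: screen generated content before release."""
    for pat in BLOCKED_OUTPUT_PATTERNS:
        if re.search(pat, text, re.IGNORECASE):
            return ModerationResult(False, f"blocked output pattern: {pat}")
    return ModerationResult(True)

def guarded_generate(prompt: str, generate) -> str:
    """Run generation only if both guardrail layers pass."""
    pre = check_request(prompt)
    if not pre.allowed:
        return f"REFUSED ({pre.reason})"
    output = generate(prompt)
    post = check_output(output)
    if not post.allowed:
        return f"WITHHELD ({post.reason})"
    return output
```

The two-layer design matters: input filtering stops obviously abusive requests cheaply, while output filtering catches violative content produced from prompts that evaded the first layer.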