Generation of personalized content for harassment, extortion, or intimidation
GPAIs can be misused to automatically generate content tailored to targeted individuals' vulnerabilities [30]. Such attacks may be more efficient and more likely to succeed at harassment, extortion, or intimidation.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1181
Domain lineage
4. Malicious Actors & Misuse
4.3 > Fraud, scams, and targeted manipulation
Mitigation strategy
1. **Technical Content Filtering and Refusal Training** Implement rigorous technical safeguards, including input and output filtering mechanisms and extensive refusal training, throughout the General-Purpose AI (GPAI) model lifecycle to proactively constrain the model's ability to generate content that matches known patterns of harassment, extortion, or intimidation.
2. **Data Minimization and Digital Footprint Reduction** Adopt a comprehensive data minimization strategy across the organization and encourage all personnel to reduce the online availability of sensitive personal information. This limits the input data available for creating highly personalized and effective malicious content, thereby reducing the risk surface for targeted manipulation.
3. **Formal Incident Response and Policy Updates** Establish and formally document clear, pre-defined protocols for the timely investigation of, and response to, incidents of synthetic or personalized digital harassment. In parallel, audit and update all relevant organizational policies (e.g., harassment, acceptable use) to explicitly define and prohibit the creation or circulation of malicious AI-generated content.
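The output-filtering safeguard in mitigation 1 can be sketched as a simple post-generation check. This is a minimal, illustrative sketch only: the pattern list, refusal message, and function name are hypothetical placeholders, and a production system would use trained classifiers rather than a static regex blocklist.

```python
import re

# Illustrative patterns for extortion/intimidation cues; a real deployment
# would rely on learned classifiers, not a hand-written blocklist.
BLOCKED_PATTERNS = [
    re.compile(r"\bpay (?:me|us) or\b", re.IGNORECASE),       # extortion cue
    re.compile(r"\bi know where you live\b", re.IGNORECASE),  # intimidation cue
]

REFUSAL = "[filtered: output matched a harassment/extortion pattern]"

def filter_output(text: str) -> str:
    """Return the model output unchanged, or a refusal string if the
    output matches any blocked pattern."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            return REFUSAL
    return text
```

In practice such a filter sits alongside refusal training rather than replacing it, catching outputs that slip past the model's own safeguards.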