4. Malicious Actors & Misuse

Generation of personalized content for harassment, extortion, or intimidation

GPAIs can be misused to automatically generate content personalized to exploit a targeted individual's vulnerabilities [30]. Such attacks can be more efficient, and more likely to succeed at harassment, extortion, or intimidation, than manually crafted ones.

Source: MIT AI Risk Repository (mit1181)

ENTITY: 1 - Human

INTENT: 1 - Intentional

TIMING: 2 - Post-deployment

Risk ID: mit1181

Domain lineage: 4. Malicious Actors & Misuse (223 mapped risks) > 4.3 Fraud, scams, and targeted manipulation

Mitigation strategy

1. **Technical Content Filtering and Refusal Training.** Implement rigorous technical safeguards, including input and output filtering mechanisms and extensive refusal training, throughout the General-Purpose AI (GPAI) model lifecycle to proactively constrain its ability to generate content that aligns with known patterns of harassment, extortion, or intimidation.
2. **Data Minimization and Digital Footprint Reduction.** Adopt a comprehensive data minimization strategy across the organization and encourage all personnel to decrease the online availability of sensitive personal information. This limits the input data available for creating highly personalized and effective malicious content, thereby reducing the risk surface for targeted manipulation.
3. **Formal Incident Response and Policy Updates.** Establish and formally document clear, pre-defined protocols for the timely investigation of, and response to, incidents of synthetic or personalized digital harassment. Simultaneously, audit and update all relevant organizational policies (e.g., harassment, acceptable use) to explicitly define and prohibit the creation or circulation of malicious AI-generated content.
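To make the first mitigation concrete, the sketch below shows one minimal form an output filter could take: a wrapper that screens generated text against patterns associated with extortion or intimidation before release. This is an illustrative assumption, not the repository's prescribed implementation; production systems typically use trained safety classifiers rather than regex heuristics, and all function names and patterns here are hypothetical.

```python
import re

# Hypothetical patterns approximating extortion/intimidation cues.
# A real deployment would use a trained classifier, not a keyword list.
BLOCKED_PATTERNS = [
    re.compile(r"\bpay\s+us\s+or\b", re.IGNORECASE),
    re.compile(r"\bwe\s+know\s+where\s+you\s+live\b", re.IGNORECASE),
    re.compile(r"\brelease\s+your\s+(photos|private\s+data)\b", re.IGNORECASE),
]

REFUSAL = "[Blocked: output matched a harassment/extortion pattern.]"

def filter_output(generated_text: str) -> str:
    """Return the model's text if it passes screening, else a refusal."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(generated_text):
            return REFUSAL
    return generated_text
```

In practice such a filter would sit at both ends of the pipeline (screening prompts as well as outputs), since refusal training and output filtering are complementary layers rather than substitutes.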