Widespread use of persuasion tools
Widespread use of AI-powered persuasion tools could lead to systemic harm
ENTITY
1 - Human
INTENT
3 - Other
TIMING
2 - Post-deployment
Risk ID
mit1092
Domain lineage
4. Malicious Actors & Misuse
4.1 > Disinformation, surveillance, and influence at scale
Mitigation strategy
1. Establish and Enforce Regulatory Prohibitions and Liability
Mandate and enforce comprehensive regulatory frameworks, such as the principles outlined in the EU AI Act, that explicitly prohibit the deployment of AI systems leveraging manipulative or deceptive techniques (e.g., exploitation of cognitive vulnerabilities, subliminal messaging) to materially distort human behavior. Couple these prohibitions with clear legal and organizational liability mechanisms for developers and deployers, ensuring accountability for systemic harms resulting from persuasive misuse.

2. Implement Advanced Technical Safety and Evasion Defenses
Require rigorous pre-deployment safety evaluations and red-teaming to test model resilience against persuasion-based adversarial attacks, including jailbreaking and persuasive prompting. In addition, mandate continuous, adaptive technical safeguards, such as hardened system prompts and real-time context summarization, to inhibit the generation and delivery of manipulative content, with particular attention to preventing the reinforcement of user biases or the enabling of dangerous behaviors.

3. Enhance Transparency and Promote Cognitive Resilience
Require mandatory disclosure and provenance tracking for AI-generated persuasive content across high-stakes public domains (e.g., political discourse, health, finance) to reduce algorithmic opacity. Support this with large-scale, research-informed public education campaigns that bolster digital and media literacy, equipping citizens with the critical thinking skills needed to recognize and resist advanced, personalized AI-driven influence and manipulation attempts.