4. Malicious Actors & Misuse (Post-deployment)

AI's persuasive capabilities are misused to gain influence and promote harmful ideologies

As AI capabilities advance, they may be used to develop sophisticated persuasion tools that tailor their communication to specific users in order to persuade them of certain claims [42]. While such tools could serve the social good (for example, the New York Times' chatbot that helps users persuade people to get vaccinated against COVID-19 [27]), there are also many ways they could be misused by self-interested groups to gain influence and/or to promote harmful ideologies.

Source: MIT AI Risk Repository (mit902)

ENTITY: 1 - Human

INTENT: 1 - Intentional

TIMING: 2 - Post-deployment

Risk ID: mit902

Domain lineage: 4. Malicious Actors & Misuse (223 mapped risks) > 4.1 Disinformation, surveillance, and influence at scale

Mitigation strategy

1. Implement and continually stress-test robust, unbypassable safety guardrails within large language models to prevent the generation of highly persuasive, polarizing, or harmful content, specifically prioritizing resistance to adversarial attacks such as jailbreaking that circumvent existing safeguards.

2. Mandate enhanced, pre-deployment safety evaluations that expand beyond measuring persuasion *success* to assess the model's *propensity* to generate persuasive *attempts* on ethically fraught topics, coupled with a regulatory requirement for content provenance via digital watermarking to improve the traceability of AI-generated media.

3. Invest in and integrate large-scale digital and media literacy programs into public education and awareness campaigns to cultivate cognitive resilience among citizens, thereby improving their capacity to critically evaluate information, recognize manipulated narratives, and resist AI-driven influence operations.