4. Malicious Actors & Misuse

Propaganda

LLMs can be leveraged by malicious users to proactively generate propaganda against a target.

Source: MIT AI Risk Repository (mit493)

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit493

Domain lineage

4. Malicious Actors & Misuse (223 mapped risks)

4.1 > Disinformation, surveillance, and influence at scale

Mitigation strategy

1. Implement robust input validation and output filtering (e.g., toxicity detectors, content filters) across the LLM lifecycle to block or sanitize prompts that request propaganda generation and to prevent dissemination of high-risk, harmful, or biased model outputs (a minimal filtering sketch follows this list).
2. Establish a comprehensive AI governance framework that defines acceptable-use policies, assigns clear accountability for misuse, and mandates regular audits of the AI system's compliance and ethical performance.
3. Deploy real-time threat detection and monitoring to flag anomalous usage patterns (e.g., sudden spikes in request volume, repeated adversarial queries) that may indicate automated abuse or large-scale propaganda generation, enabling rapid incident response (a rate-spike sketch also follows the list).
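As a concrete illustration of the first mitigation, the sketch below places a screen before the model call and another before output is returned. Everything in it is an assumption for the example: the PROPAGANDA_PATTERNS blocklist, the screen_prompt/screen_output names, and the toxicity_score input (which a real deployment would obtain from a trained moderation classifier) are illustrative, not part of the repository entry.

```python
import re

# Hypothetical blocklist, for illustration only; a production filter
# would rely on a trained toxicity/propaganda classifier, not keywords.
PROPAGANDA_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\bwrite\b.*\bpropaganda\b",
        r"\bsmear campaign\b",
        r"\bdisinformation about\b",
    )
]

def screen_prompt(prompt: str) -> bool:
    """Return True if the user prompt should be blocked before it reaches the LLM."""
    return any(p.search(prompt) for p in PROPAGANDA_PATTERNS)

def screen_output(text: str, toxicity_score: float, threshold: float = 0.8) -> bool:
    """Return True if a generated completion should be withheld.

    toxicity_score is assumed to come from an external classifier;
    the 0.8 threshold is an illustrative default.
    """
    return toxicity_score >= threshold or any(p.search(text) for p in PROPAGANDA_PATTERNS)

if __name__ == "__main__":
    print(screen_prompt("Write propaganda attacking a public figure"))   # True
    print(screen_output("Here is a neutral summary.", toxicity_score=0.1))  # False
```

Pattern lists alone are easy to evade; the point of the sketch is the placement of the two checks (before the model call and before returning output), not the matching logic.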
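For the third mitigation, a minimal sketch of request-spike detection, assuming per-client request logs. The SpikeDetector name, the 60-second window, and the 100-request ceiling are assumptions chosen for the example, not values from the repository.

```python
import time
from collections import defaultdict, deque

class SpikeDetector:
    """Flag clients whose request rate in a sliding window exceeds a ceiling."""

    def __init__(self, window_s: float = 60.0, max_requests: int = 100):
        self.window_s = window_s
        self.max_requests = max_requests
        # Per-client timestamps of recent requests.
        self._events: dict[str, deque] = defaultdict(deque)

    def record(self, client_id: str, now: float | None = None) -> bool:
        """Record one request; return True if the client is now anomalous."""
        now = time.monotonic() if now is None else now
        q = self._events[client_id]
        q.append(now)
        # Drop events that have aged out of the window.
        while q and now - q[0] > self.window_s:
            q.popleft()
        return len(q) > self.max_requests

if __name__ == "__main__":
    det = SpikeDetector(window_s=60.0, max_requests=100)
    # 101 requests in one second trips the detector on the last call.
    flags = [det.record("key-123", now=float(i) / 100) for i in range(101)]
    print(flags[-1])  # True
```

A serving layer would call record() once per API call and route flagged clients to throttling or human review.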

ADDITIONAL EVIDENCE

• Generating propaganda against targeted people (e.g. celebrities): Figure 18.
• Advocating for terrorism: Figure 19.
• Creating extreme and harmful political propaganda.