7. AI System Safety, Failures, & Limitations

Societal manipulation

A sufficiently intelligent AI could subtly influence societal behavior through a sophisticated understanding of human nature.

Source: MIT AI Risk Repository (mit115)

ENTITY

2 - AI

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit115

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.1 > AI pursuing its own goals in conflict with human goals or values

Mitigation strategy

1. Mandate comprehensive AI governance and compliance frameworks for all high-risk systems, requiring pre-deployment risk assessment, ongoing adversarial robustness testing, and clear response plans to mitigate intentional, post-deployment manipulation.

2. Establish strict algorithmic transparency and accountability measures, including mandatory transparency reports and explainability features, to enable timely detection of subtle, harmful societal influence from AI systems that pursue goals conflicting with human values.

3. Invest significantly in public digital literacy and cognitive-resilience campaigns to fortify the population against the subtle and sophisticated forms of manipulation a highly intelligent AI could generate and deploy.