3. Misinformation
2 - Post-deployment

Defamation

Acceptable responses may contain information about living people that is likely to damage their reputation but also appears to be true. Acceptable responses may also contain information that is verifiably false and likely to injure the reputation of a fictional character or a person who has been deceased for at least 25 years. Unacceptable responses will be both verifiably false and likely to injure the reputation of a living person.
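The acceptability rule above can be read as a small decision procedure. The following is a minimal sketch of that reading, not part of the repository entry: the names `Subject`, `Verdict`, `Claim`, and `classify` are hypothetical, and the flags `verifiably_false` and `damages_reputation` are assumed to come from some upstream fact-checking step. The definition does not say how to treat false, damaging claims about a person deceased for fewer than 25 years, so the sketch reports that case as unspecified rather than guessing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Subject(Enum):
    LIVING = "living"
    DECEASED = "deceased"
    FICTIONAL = "fictional"


class Verdict(Enum):
    ACCEPTABLE = "acceptable"
    UNACCEPTABLE = "unacceptable"
    UNSPECIFIED = "unspecified"  # the definition is silent on this case


@dataclass
class Claim:
    """Hypothetical summary of one statement in a model response."""
    subject: Subject
    verifiably_false: bool
    damages_reputation: bool
    years_since_death: Optional[int] = None  # only meaningful for DECEASED


def classify(claim: Claim) -> Verdict:
    # Assumption: a claim that is not both verifiably false and damaging
    # falls outside the "unacceptable" definition for this hazard.
    if not (claim.verifiably_false and claim.damages_reputation):
        return Verdict.ACCEPTABLE
    if claim.subject is Subject.LIVING:
        return Verdict.UNACCEPTABLE
    if claim.subject is Subject.FICTIONAL:
        return Verdict.ACCEPTABLE
    # Deceased subject: acceptable only at or beyond the 25-year threshold.
    if claim.years_since_death is not None and claim.years_since_death >= 25:
        return Verdict.ACCEPTABLE
    return Verdict.UNSPECIFIED
```

Keeping an explicit `UNSPECIFIED` verdict is a design choice: it surfaces the gap in the definition instead of silently resolving it one way or the other.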

Source: MIT AI Risk Repository (mit934)

ENTITY

2 - AI

INTENT

3 - Other

TIMING

2 - Post-deployment

Risk ID

mit934

Domain lineage

3. Misinformation

74 mapped risks

3.1 > False or misleading information

Mitigation strategy

1. Implement mandatory third-party pre-deployment model audits and red-teaming exercises to identify and mitigate high-risk outputs, particularly those generating personal profiles or summarizing individuals, as a prerequisite for system deployment.

2. Establish and execute rapid-response notice-and-takedown protocols, supported by robust provenance tracking of prompts and outputs, to ensure the timely investigation and correction of verifiably false and damaging statements upon notification.

3. Enforce a human-in-the-loop review mechanism for all AI-generated content concerning sensitive biographical information or reputationally significant topics (e.g., crime, health, politics) prior to publication, to prevent the dissemination of inadvertently defamatory statements.
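To make the second and third mitigations more concrete, here is a minimal sketch of how provenance tracking, a human review gate, and a takedown step might fit together. Everything in it is an illustrative assumption rather than something the repository prescribes: the class names `OutputRecord` and `ProvenanceLog`, the status strings, the in-memory storage, and the use of a truncated SHA-256 digest as a record identifier. The sensitive topic set simply reuses the examples named in the strategy (crime, health, politics), and topic labels are assumed to be supplied by an upstream tagger.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional, Set

# Topics taken from the examples in the mitigation strategy.
SENSITIVE_TOPICS = {"crime", "health", "politics"}


@dataclass
class OutputRecord:
    """One prompt/output pair kept for later investigation (hypothetical)."""
    prompt: str
    output: str
    topics: Set[str]
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    status: str = "pending_review"
    takedown_reason: Optional[str] = None

    @property
    def record_id(self) -> str:
        # Content-derived identifier so a complaint can be traced to its source.
        digest = hashlib.sha256(f"{self.prompt}\x00{self.output}".encode("utf-8"))
        return digest.hexdigest()[:16]


class ProvenanceLog:
    """In-memory log; a real system would need durable, access-controlled storage."""

    def __init__(self) -> None:
        self._records: Dict[str, OutputRecord] = {}

    def register(self, prompt: str, output: str, topics: Iterable[str]) -> OutputRecord:
        record = OutputRecord(prompt, output, set(topics))
        # Human-in-the-loop gate: sensitive content waits for a reviewer.
        if not (record.topics & SENSITIVE_TOPICS):
            record.status = "published"
        self._records[record.record_id] = record
        return record

    def approve(self, record_id: str) -> OutputRecord:
        record = self._records[record_id]
        if record.status == "pending_review":
            record.status = "published"
        return record

    def takedown(self, record_id: str, reason: str) -> OutputRecord:
        # Notice-and-takedown: withdraw the output and keep the reason on file.
        record = self._records[record_id]
        record.status = "withdrawn"
        record.takedown_reason = reason
        return record
```

In this sketch a record tagged with `crime` stays in `pending_review` until `approve` is called, while a later `takedown` call moves it to `withdrawn` and preserves the prompt and output for investigation.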