3. Misinformation (Post-deployment)

Causing material harm by disseminating false or poor information

Poor or false LM predictions can indirectly cause material harm. Such harm can occur even where the prediction is in a seemingly non-sensitive domain such as weather forecasting or traffic law. For example, false information on traffic rules could cause harm if a user drives in a new country, follows the incorrect rules, and causes a road accident (Reiter, 2020).

Source: MIT AI Risk Repository (mit242)

ENTITY

2 - AI

INTENT

2 - Unintentional

TIMING

2 - Post-deployment

Risk ID

mit242

Domain lineage

3. Misinformation

74 mapped risks

3.1 > False or misleading information

Mitigation strategy

1. Implement data and knowledge grounding architectures, such as Retrieval-Augmented Generation (RAG), to anchor Large Language Model (LLM) responses to up-to-date and verifiable external data sources, thereby minimizing the incidence of hallucinations and temporal misinformation.

2. Deploy a multi-layered safety guardrail system, including real-time streaming content monitoring and output filtering, to detect and block the dissemination of factually incorrect or harmful content. For applications involving high-stakes decision-making, establish a Human-in-the-Loop (HITL) review mechanism for output validation.

3. Establish a continuous model alignment and validation process, using techniques such as Reinforcement Learning from Human Feedback (RLHF) and adversarial fine-tuning to proactively reduce biases and errors stemming from the training data, ensuring the model is systematically rewarded for generating truthful and factual information.
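The grounding step in the first mitigation can be sketched minimally as follows. This is an illustrative assumption, not the repository's recommended implementation: a toy keyword-overlap retriever stands in for a real vector store, and `build_prompt` is a hypothetical helper that instructs the model to answer only from the retrieved sources or abstain.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query.

    A production RAG system would use embedding similarity against a
    vector store instead of this keyword heuristic.
    """
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Hypothetical helper: anchor the answer to retrieved sources only,
    and ask the model to abstain when the sources do not cover the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using ONLY the sources below. If they do not cover the "
        "question, say you do not know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


# Example: grounding a traffic-rule question (fictional 'Country X' data).
docs = [
    "In Country X, vehicles drive on the left side of the road.",
    "Speed limits in urban areas are typically 50 km/h.",
    "The national dish of Country X is a rice-based stew.",
]
prompt = build_prompt("Which side of the road do cars drive on in Country X?", docs)
```

The abstention instruction is the key design choice for this risk: rather than letting the model fall back on parametric (possibly stale or wrong) knowledge about traffic law, the prompt restricts it to verifiable sources, reducing the chance of the harmful false belief described above.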

ADDITIONAL EVIDENCE

Moreover, information does not have to be strictly false to cause a harmful false belief: omitting critical information or presenting misleading information may also lead to such outcomes.