Domain-Specific Misuses
Improvements in LLMs may exert greater pressure to apply LLMs to various domains, such as health and education (Eloundou et al., 2023). Crude efforts to use LLMs in such domains, however, may incur harm and should be strongly discouraged. In particular, it is important to guard against the different ways in which LLMs may be misused within any domain. One well-known episode of misuse within the health sector involved a mental health non-profit experimenting with LLM-based therapy on its users without their informed consent (Xiang, 2023a). Within the education sector, LLMs may be misused in various ways that could impact student learning, e.g. as a cheating aid by students or as a (low-quality) evaluator of student work by instructors (Cotton et al., 2023). Recent findings in moral psychology also suggest that LLMs can generate moral evaluations that people perceive as superior to human judgments; these could be misused to create compelling yet harmful moral guidance (Aharoni et al., 2024). Similar risks of misuse may exist in other domains as well.
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1494
Domain lineage
4. Malicious Actors & Misuse
4.3 > Fraud, scams, and targeted manipulation
Mitigation strategy
1. Implement **Reinforcement Learning from Human Feedback (RLHF)** and safety fine-tuning during model development to align LLM outputs with ethical and domain-specific standards, proactively mitigating toxicity, bias, and the generation of harmful or inaccurate guidance.
2. Establish **Human-in-the-Loop (HITL)** protocols, particularly for high-stakes applications such as health consultation or student assessment, requiring expert human review to critically validate LLM decisions and outputs before final deployment or application.
3. Ensure complete **user transparency and informed consent** by clearly disclosing the use of LLMs in applications and requiring explicit consent for data collection and usage, directly addressing ethical concerns about unauthorized experimentation in sensitive domains.
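The HITL protocol in point 2 could be sketched as a simple gating function that blocks high-stakes outputs until an expert approves them. This is a minimal illustrative sketch, not a cited implementation; all names (`LLMOutput`, `release`, the domain labels) are assumptions introduced here:

```python
# Hypothetical sketch of a human-in-the-loop (HITL) review gate for
# high-stakes LLM outputs. All names and domain labels are illustrative
# assumptions, not part of any real API or cited system.
from dataclasses import dataclass
from typing import Callable, Optional

HIGH_STAKES_DOMAINS = {"health", "education"}  # assumed domain labels


@dataclass
class LLMOutput:
    domain: str
    text: str


def release(output: LLMOutput,
            human_review: Callable[[LLMOutput], bool]) -> Optional[str]:
    """Return the text only if it passes the required review path."""
    if output.domain in HIGH_STAKES_DOMAINS:
        # High-stakes content must be explicitly approved by an expert
        # reviewer before it reaches the end user.
        if not human_review(output):
            return None  # blocked pending revision
    return output.text


# Example: an instructor reviews an LLM-drafted student assessment
# before it is released to the student.
approved = release(LLMOutput("education", "Feedback draft..."),
                   human_review=lambda o: True)
```

The design choice is that review is mandatory only for domains flagged as high-stakes, which keeps the human bottleneck where expert validation matters most.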