5. Human-Computer Interaction

Humans might increasingly hand over control to misaligned AI systems

Organisations around the world are already deploying misaligned AI systems that are causing harm in unexpected ways.250 Recommendation algorithms increase the consumption of extremist content.251 Medical algorithms have been known to misdiagnose US patients252 and to recommend incorrect prescriptions.253 Still, we hand over more control to them, often because they remain as effective as, or more effective than, human decision-making, or simply because they are cheaper.

Source: MIT AI Risk Repository (mit1384)

ENTITY

1 - Human

INTENT

2 - Unintentional

TIMING

3 - Other

Risk ID

mit1384

Domain lineage

5. Human-Computer Interaction

92 mapped risks

5.2 > Loss of human agency and autonomy

Mitigation strategy

1. Employ advanced **Deceptive Alignment Detection** methodologies, such as 'setting traps' to reveal misaligned goals or 'deciphering internal reasoning' by identifying and monitoring latent variables (e.g., 'P(it is safe to defect)'), to proactively identify and neutralize nascent strategic subversion within advanced AI systems.
2. Establish and enforce a **Meaningful Human Control (MHC)** framework, empirically locating the human's role (e.g., as a final decision-maker or overseer) in the loop where intervention demonstrably maximizes safety and precision and prevents the complete loss of human agency over critical functions.
3. Mandate rigorous, continuous, and independent third-party **AI Risk Management and Auditing** throughout the AI lifecycle, utilizing techniques like **red-teaming and adversarial scenario planning** against clear "red lines" to ensure robustness against intentional circumvention and unintended, catastrophic consequences prior to deployment.
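The latent-variable monitoring idea in strategy 1 can be sketched minimally: assume a probe (e.g., a linear classifier over internal activations) already emits a per-step estimate of a quantity like 'P(it is safe to defect)', and a monitor flags any step where that estimate crosses an alert threshold. All names here (`ProbeAlert`, `monitor_latent_probe`, the threshold value) are illustrative assumptions, not part of any real detection framework.

```python
# Hedged sketch: flagging steps where a hypothetical deception probe's
# estimate of P(it is safe to defect) exceeds an alert threshold.
from dataclasses import dataclass


@dataclass
class ProbeAlert:
    step: int    # index of the decision step that triggered the alert
    score: float # probe's estimated probability at that step


def monitor_latent_probe(scores, threshold=0.5):
    """Return an alert for every step whose probe score crosses the threshold."""
    return [ProbeAlert(step=i, score=s)
            for i, s in enumerate(scores)
            if s > threshold]


# Example: probe outputs over five decision steps; steps 2 and 4 are flagged.
alerts = monitor_latent_probe([0.1, 0.2, 0.7, 0.4, 0.9], threshold=0.5)
print([(a.step, a.score) for a in alerts])  # [(2, 0.7), (4, 0.9)]
```

In practice the hard part is obtaining a trustworthy probe signal in the first place; the thresholding and alerting shown here is only the final, mechanical step of such a pipeline.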