7. AI System Safety, Failures, & Limitations

Building a human-AI environment

This category encompasses nearly 17% of the articles and addresses the overall imperative of establishing a harmonious coexistence between humans and machines, along with the key concerns that give rise to this need.

Source: MIT AI Risk Repository, risk ID mit583

ENTITY

3 - Other

INTENT

3 - Other

TIMING

3 - Other

Risk ID

mit583

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.1 > AI pursuing its own goals in conflict with human goals or values

Mitigation strategy

1. Prioritize foundational research into AI alignment, specifically focusing on outer alignment (formal specification of complex human goals and values) and inner alignment (ensuring the system robustly adopts the intended specification, including resistance to strategic deception and reward hacking).

2. Prohibit the deployment of advanced AI systems in high-risk domains, such as autonomously pursuing open-ended goals or overseeing critical infrastructure, until formal safety proofs and empirically validated control mechanisms (e.g., robust shutdown procedures and immutable audit trails) are established.

3. Implement globally coordinated safety regulations and comprehensive governance frameworks to enforce a safety-oriented organizational culture, require rigorous, multi-layered risk defenses for AI development, and mitigate the systemic risks associated with an uncontrolled AI development race.
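One of the control mechanisms named above, an immutable audit trail, can be illustrated with a minimal sketch. The `AuditTrail` class below is a hypothetical example (not part of the repository or any cited framework): each log entry commits to the hash of the previous entry, so retroactively altering any record invalidates the rest of the chain.

```python
import hashlib
import json


class AuditTrail:
    """Append-only log where each entry commits to the previous entry's
    hash, so tampering with any record breaks the chain on verification."""

    GENESIS = "0" * 64  # placeholder hash for the first entry

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        """Record an event, chaining it to the previous entry's hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash from the genesis value; any edit to an
        earlier event makes a later stored hash fail to match."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            payload = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


trail = AuditTrail()
trail.append({"action": "model_deployed", "actor": "ops"})
trail.append({"action": "shutdown_requested", "actor": "oversight"})
print(trail.verify())  # True on an untampered chain

trail.entries[0]["event"]["actor"] = "attacker"  # simulate tampering
print(trail.verify())  # False: the chain no longer validates
```

A production system would additionally need to protect the log storage itself (e.g., write-once media or external anchoring); the hash chain only makes tampering detectable, not impossible.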