Building a human-AI environment
This category encompasses nearly 17% of the articles and addresses the overall imperative of establishing a harmonious coexistence between humans and machines, along with the key concerns that give rise to this need.
ENTITY
3 - Other
INTENT
3 - Other
TIMING
3 - Other
Risk ID
mit583
Domain lineage
7. AI System Safety, Failures, & Limitations
7.1 > AI pursuing its own goals in conflict with human goals or values
Mitigation strategy
1. Prioritize foundational research into AI alignment, specifically focusing on outer alignment (formal specification of complex human goals and values) and inner alignment (ensuring the system robustly adopts the intended specification, including resistance to strategic deception and reward hacking).
2. Prohibit the deployment of advanced AI systems in high-risk domains, such as autonomously pursuing open-ended goals or overseeing critical infrastructure, until formal safety proofs and empirically validated control mechanisms (e.g., robust shutdown procedures and immutable audit trails) are established.
3. Implement globally coordinated safety regulations and comprehensive governance frameworks to enforce a safety-oriented organizational culture, require rigorous, multi-layered risk defenses for AI development, and mitigate the systemic risks associated with an uncontrolled AI development race.