7. AI System Safety, Failures, & Limitations

2 - Post-deployment

Unethical decision making

If, for example, an agent were programmed to operate war machinery in the service of its country, it would need to make ethical decisions regarding the termination of human life. This capacity to make non-trivial ethical or moral judgments concerning people may pose issues for human rights.

Source: MIT AI Risk Repository (mit108)

ENTITY

2 - AI

INTENT

1 - Intentional

TIMING

2 - Post-deployment

Risk ID

mit108

Domain lineage

7. AI System Safety, Failures, & Limitations

375 mapped risks

7.3 > Lack of capability or robustness

Mitigation strategy

1. Mandate Human Oversight and Control for Lethal Decision-Making. Implement strict human-in-the-loop or human-on-the-loop protocols for all actions involving the termination of human life. This requires technical safeguards that ensure the system cannot independently select or engage a target without explicit, time-critical authorization from a human operator, thereby preserving human judgment and accountability over the use of force.

2. Embed Ethical Alignment and Robustness Engineering. Design the AI system using formal verification and alignment techniques to ensure its goals and performance metrics are rigorously and provably compliant with human ethical standards, particularly International Humanitarian Law and Human Rights norms. This includes rigorous adversarial testing to ensure the system is robust and stable under a wide range of operational conditions, preventing unpredictable or unintended unethical behavior.

3. Establish a Formal Accountability and Governance Framework. Develop and enforce a comprehensive AI governance policy that clearly defines the legal and operational responsibilities of developers, deployers, and operators for all system outcomes. This framework must mandate transparent, auditable logs of all AI decisions to enable post-deployment analysis and ensure that clear lines of accountability are maintained in the event of an unethical failure or human rights violation.
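Two of the safeguards named above, explicit time-critical human authorization and auditable decision logs, can be illustrated with a minimal Python sketch. It is not drawn from the source: the names `Authorization`, `AuditLog`, `gate`, and the 30-second validity window are hypothetical, the action is a generic high-consequence action, and the hash-chained log is one possible way (assumed here) to make records tamper-evident. The gate denies by default and records every decision, approved or not.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Optional

# Assumed validity window for a human authorization (illustrative value only).
MAX_AUTH_AGE_SECONDS = 30.0


@dataclass
class Authorization:
    """A human operator's explicit approval of one specific action (hypothetical type)."""
    operator_id: str
    action_id: str
    issued_at: float  # seconds since the epoch


def _digest(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous entry's hash plus this record."""
    body = {"prev_hash": prev_hash, "record": record}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each entry is chained to the hash of the one before it."""
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> None:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        self.entries.append(
            {"prev_hash": prev_hash, "record": record, "hash": _digest(prev_hash, record)}
        )

    def verify(self) -> bool:
        """Return False if any entry was altered or the chain was broken."""
        prev_hash = "0" * 64
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != _digest(entry["prev_hash"], entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True


def gate(action_id: str, auth: Optional[Authorization], log: AuditLog,
         now: Optional[float] = None) -> bool:
    """Default-deny gate: approve only with a fresh authorization for this exact action."""
    now = time.time() if now is None else now
    if auth is None:
        approved, reason = False, "no human authorization"
    elif auth.action_id != action_id:
        approved, reason = False, "authorization is for a different action"
    elif not (0.0 <= now - auth.issued_at <= MAX_AUTH_AGE_SECONDS):
        approved, reason = False, "authorization expired or not yet valid"
    else:
        approved, reason = True, "authorized by human operator"
    # Every decision is logged, including denials, to support later review.
    log.append({
        "action_id": action_id,
        "operator_id": auth.operator_id if auth else None,
        "approved": approved,
        "reason": reason,
        "time": now,
    })
    return approved
```

In this sketch the system component that proposes an action never calls the actuator directly; it calls `gate` and proceeds only on `True`. Calling `log.verify()` during post-deployment analysis detects an edited or reordered entry, although a real deployment would also need the log stored outside the control of the system being audited.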