Sex-related crimes
This category addresses responses that enable, encourage, or endorse the commission of sex-related crimes.
ENTITY
2 - AI
INTENT
3 - Other
TIMING
2 - Post-deployment
Risk ID
mit356
Domain lineage
1. Discrimination & Toxicity
1.2 > Exposure to toxic content
Mitigation strategy
1. Mandate the implementation of **stringent runtime behavioral monitoring and post-generation content filtering** on all AI outputs and user interactions to proactively detect and block responses that enable, encourage, or endorse sex-related crimes, including the creation or sharing of Child Sexual Abuse Material (CSAM) and online grooming.
2. Establish a comprehensive, **continuous monitoring system** that tracks model performance and security posture in real time at the inference endpoint, specifically designed to identify and flag adversarial input attempts (e.g., "jailbreaking") and anomalous query patterns that aim to circumvent established safety safeguards.
3. Require the maintenance of **immutable audit trails and detailed documentation** of all AI-driven decisions, content moderation events, and policy adherence checks. Such a system is critical for compliance with emerging regulations, for robust post-incident forensic analysis, and for the iterative refinement of safety classifiers.
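The combination of post-generation filtering (point 1) and an immutable audit trail (point 3) can be sketched as follows. This is a minimal illustration, not a production design: `classify` is a hypothetical stand-in for a trained safety classifier (here reduced to a trivial keyword check), and the audit trail uses a SHA-256 hash chain so that tampering with earlier records is detectable.

```python
import hashlib
import json

# Placeholder terms; a real deployment would use a trained safety classifier,
# not a keyword list.
BLOCKLIST = {"csam", "grooming"}


def classify(text: str) -> bool:
    """Hypothetical safety classifier: returns True if output must be blocked."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


class AuditTrail:
    """Append-only log; each record embeds the hash of the previous record
    (a hash chain), so modifying any earlier entry breaks verification."""

    def __init__(self) -> None:
        self.records: list[dict] = []
        self._prev_hash = "0" * 64

    def append(self, event: dict) -> None:
        body = {"event": event, "prev_hash": self._prev_hash}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.records.append({**body, "hash": digest})
        self._prev_hash = digest

    def verify(self) -> bool:
        prev = "0" * 64
        for record in self.records:
            body = {"event": record["event"], "prev_hash": record["prev_hash"]}
            recomputed = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if record["prev_hash"] != prev or recomputed != record["hash"]:
                return False
            prev = record["hash"]
        return True


def moderate(output: str, trail: AuditTrail) -> str:
    """Filter one model output and record the moderation decision."""
    blocked = classify(output)
    trail.append({"output_len": len(output), "blocked": blocked})
    return "[BLOCKED]" if blocked else output
```

In practice the moderation event would also carry a timestamp, model version, and policy identifiers, and the trail would be persisted to write-once storage; the hash chain here only demonstrates the tamper-evidence property the mitigation calls for.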