Sexual Content
This category addresses responses that contain erotica, including depictions of sexual behavior, genitalia, and overtly sexualized body parts.
ENTITY
2 - AI
INTENT
3 - Other
TIMING
2 - Post-deployment
Risk ID
mit366
Domain lineage
1. Discrimination & Toxicity
1.2 > Exposure to toxic content
Mitigation strategy
1. Prioritized Action: Implement robust, multi-layered content filtering mechanisms to prevent the generation of responses that fall under the defined criteria for erotica, explicit sexual behavior, or depictions of overtly sexualized body parts at the inference stage. This includes deploying advanced safety classifiers to detect and block or significantly modify high-risk outputs before delivery to the end user.
2. Prioritized Action: Establish a comprehensive incident response protocol and automated monitoring system for all user interactions. This system must log and analyze inputs and outputs to identify and flag attempts at policy violation, such as sexual solicitation or the prompting of non-consensual content, enabling timely administrative intervention and potential user-access restrictions.
3. Prioritized Action: Maintain clear and transparent safety guidelines outlining the prohibition on generating sexually explicit content. Ensure that the model provides consistent, helpful, and policy-aligned refusal messages when confronted with high-risk or inappropriate prompts, thereby reinforcing system boundaries and educating users on responsible interaction.
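As an illustration only, the inference-stage filtering, logging, and refusal behavior described in the mitigation strategy could be sketched as a single gate applied to each model response before delivery. Everything named here is an assumption made for the example, not part of the source: the `filter_response` function, the `FilterDecision` record, the `score_fn` callable (standing in for whatever safety classifier a deployment actually uses), the two thresholds, and the refusal wording.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("output_filter")

# Hypothetical refusal text; a real deployment would take this from its policy.
REFUSAL_MESSAGE = (
    "I can't produce sexually explicit content. "
    "I'm happy to help with something else."
)


@dataclass
class FilterDecision:
    action: str   # "allow", "flag", or "block"
    score: float  # classifier score in [0, 1], higher = more likely explicit
    text: str     # text that is actually delivered to the user


def filter_response(
    response: str,
    score_fn: Callable[[str], float],
    block_threshold: float = 0.8,  # assumed value, tune per deployment
    flag_threshold: float = 0.5,   # assumed value, tune per deployment
) -> FilterDecision:
    """Gate one model response before it reaches the end user."""
    score = score_fn(response)

    if score >= block_threshold:
        # High-risk output: replace it with a consistent refusal and log it.
        logger.warning("blocked response (score=%.2f)", score)
        return FilterDecision("block", score, REFUSAL_MESSAGE)

    if score >= flag_threshold:
        # Borderline output: deliver it, but record it for administrative review.
        logger.info("flagged response for review (score=%.2f)", score)
        return FilterDecision("flag", score, response)

    return FilterDecision("allow", score, response)
```

A caller would pass its own classifier as `score_fn`, for example `filter_response(text, my_classifier)`, where `my_classifier` is a placeholder for any callable that returns a score between 0 and 1. Separating the "flag" band from the "block" band is one possible design choice: it lets borderline cases feed the monitoring and incident-response process without refusing every uncertain output.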