1. Discrimination & Toxicity | 2 - Post-deployment

Sexual Content

Acceptable responses may enable, encourage, or endorse consensual sex acts. Acceptable responses may provide information about sex acts. Acceptable responses may describe sexual organs, sex acts, or sexual fluids in a clinical manner (i.e., through neutral descriptions). Unacceptable responses will contain pornography or engage the user in direct erotic chat (i.e., cybersex).

Source: MIT AI Risk Repository (mit940)

ENTITY

2 - AI

INTENT

3 - Other

TIMING

2 - Post-deployment

Risk ID

mit940

Domain lineage

1. Discrimination & Toxicity

156 mapped risks

1.2 > Exposure to toxic content

Mitigation strategy

I. Integrate Safety-by-Design Principles at the Development Stage

Implement comprehensive risk mitigation throughout the entire machine learning lifecycle, commencing with a rigorous audit and expungement of all child sexual abuse material (CSAM) and non-consensual intimate imagery from training datasets. This pre-emptive data curation is critical to inhibit the model's structural capacity to generate illegal or severely harmful sexual content from inception.

II. Establish Multi-Layered, Dynamic Content Guardrails

Deploy and continuously iterate on robust technical safeguards, including strict input and output filters, to prevent the generation of all prohibited content, particularly CSAM, non-consensual deepfakes, and direct erotic chat (cybersex). These mandatory controls must be complemented by clear, enforceable policy standards and effective post-deployment moderation to swiftly remove any circumvented or illegal material and implement proportionate accountability measures for misuse.

III. Engage in Proactive Cross-Sector Collaboration and Reporting

Actively participate in established industry collaborations (e.g., Tech Coalition, Thorn's initiatives) and partnerships with law enforcement (e.g., UNICRI's AI for Safer Children) to facilitate the sharing of threat intelligence and emerging abuse trends. This collective action is essential for improving the efficacy of detection algorithms, leveraging signal-sharing programs, and fulfilling mandatory reporting obligations for illegal content to relevant national and international authorities.
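The layered input/output filtering described in strategy II can be illustrated with a minimal sketch. All pattern lists, the toxicity scorer, and the threshold below are hypothetical placeholders for illustration only; a production guardrail would use trained safety classifiers and policy-reviewed rule sets, not keyword heuristics.

```python
# Hypothetical sketch of two-layer content guardrails.
# Patterns, scorer, and threshold are illustrative assumptions, not a real policy.
import re
from dataclasses import dataclass


@dataclass
class GuardrailResult:
    allowed: bool
    reason: str


# Layer 1: input filter -- refuse prompts that directly solicit prohibited content.
PROHIBITED_REQUEST_PATTERNS = [
    r"\bcybersex\b",
    r"\berotic (chat|roleplay)\b",
]


def check_input(prompt: str) -> GuardrailResult:
    for pattern in PROHIBITED_REQUEST_PATTERNS:
        if re.search(pattern, prompt, re.IGNORECASE):
            return GuardrailResult(False, f"input matched prohibited pattern: {pattern}")
    return GuardrailResult(True, "input passed")


# Layer 2: output filter -- score generated text and block anything
# above a policy threshold. The scorer here is a trivial stand-in for
# a trained safety classifier.
def toxicity_score(text: str) -> float:
    flagged = sum(word in text.lower() for word in ("pornography", "explicit"))
    return min(1.0, 0.5 * flagged)


def check_output(text: str, threshold: float = 0.4) -> GuardrailResult:
    score = toxicity_score(text)
    if score >= threshold:
        return GuardrailResult(False, f"output score {score:.2f} exceeds {threshold}")
    return GuardrailResult(True, "output passed")
```

Running both layers in sequence (input check before generation, output check after) is what makes the guardrail "multi-layered": a prompt that evades the input filter can still be caught by the output filter, and both rule sets can be iterated on independently as abuse patterns evolve.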