2. Privacy & Security

Risks from data (Risks of illegal collection and use of data)

Collecting AI training data and interacting with users during service provision pose privacy and security risks, including the collection of data without consent and the improper use of data and personal information.

Source: MIT AI Risk Repository (mit687)

ENTITY

1 - Human

INTENT

3 - Other

TIMING

3 - Other

Risk ID

mit687

Domain lineage

2. Privacy & Security

186 mapped risks

2.1 > Compromise of privacy by leaking or correctly inferring sensitive information

Mitigation strategy

1. Establish Comprehensive Data Governance and Ethical Guidelines
Require the implementation of a robust data governance framework and ethical guidelines that explicitly define the lawful basis for data collection, processing, and use. This framework must ensure transparency and adherence to data protection regulations (e.g., GDPR), including clear, informed consent mechanisms and provision for users to opt out of having their data utilized for model training.

2. Implement Data Protection by Design and Minimization Techniques
Employ Data Protection by Design principles by making data minimization the default standard. This necessitates the rigorous application of pseudonymization, anonymization, and differential privacy techniques to all training datasets and user inputs. The goal is to exclude or obfuscate Personally Identifiable Information (PII) before model ingestion, thereby significantly reducing the risk of sensitive data exposure or memorization.

3. Enforce Least-Privilege Access Controls and Robust Encryption
Mandate the enforcement of least-privilege and zero-trust access controls across all data storage locations, processing pipelines, and model endpoints. Furthermore, implement robust, layered encryption protocols for all data, both at rest and in transit, to prevent unauthorized access and data exfiltration from the AI system and its supporting infrastructure.
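The minimization and pseudonymization step described above can be sketched in code. This is a minimal illustration, not a prescribed implementation: the field names (`email`, `user_id`), the allow-list, and the key handling are all hypothetical assumptions for the example. It shows two of the techniques named in the strategy: dropping fields with no documented purpose (minimization) and replacing direct identifiers with keyed, irreversible tokens (pseudonymization) before records reach a training pipeline.

```python
import hashlib
import hmac
import re

# Hypothetical key for illustration only; a real deployment would load
# this from a secrets manager and rotate it under its governance policy.
PSEUDONYM_KEY = b"example-key-do-not-use-in-production"

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w-]+")


def pseudonymize(value: str) -> str:
    """Replace an identifier with a keyed, non-reversible token (HMAC-SHA256)."""
    digest = hmac.new(PSEUDONYM_KEY, value.encode("utf-8"), hashlib.sha256)
    return "pii_" + digest.hexdigest()[:16]


def minimize_record(record: dict, allowed_fields: set) -> dict:
    """Apply data minimization: keep only allow-listed fields,
    pseudonymize direct identifiers, and redact emails in free text."""
    out = {}
    for key, value in record.items():
        if key not in allowed_fields:
            continue  # minimization: drop fields with no documented purpose
        if key in {"email", "user_id"}:  # assumed direct-identifier fields
            out[key] = pseudonymize(str(value))
        elif isinstance(value, str):
            # Redact stray email addresses embedded in free-text fields.
            out[key] = EMAIL_RE.sub("[REDACTED_EMAIL]", value)
        else:
            out[key] = value
    return out


record = {
    "email": "alice@example.com",
    "user_id": 42,
    "message": "contact me at alice@example.com",
    "ssn": "123-45-6789",  # not allow-listed, so it never enters the pipeline
}
clean = minimize_record(record, allowed_fields={"email", "message"})
```

Because the token is keyed rather than a plain hash, the same identifier maps to the same token across records (preserving joins for analytics) while an attacker without the key cannot brute-force common values.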