Membership inference attack
A membership inference attack (MIA) aims to determine whether a given data sample was part of a model's training set. Given query access to a trained model, an attacker submits inputs and observes the resulting outputs (for example, confidence scores), exploiting the fact that models often behave differently on training members than on unseen data.
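As a concrete illustration, the simplest form of this attack thresholds the model's confidence on its predicted class: members of the training set tend to receive sharper, more confident outputs. The sketch below simulates this with synthetic logits; the model, the logit distributions, and the threshold value are all illustrative assumptions, not details from any specific attack paper.

```python
import numpy as np

def max_confidence(logits):
    """Softmax confidence of the predicted class, computed stably."""
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

def infer_membership(logits, threshold=0.9):
    """Guess 'member' whenever the model is unusually confident."""
    return max_confidence(logits) >= threshold

rng = np.random.default_rng(0)
# Simulated logits: training members get more peaked outputs (class 0
# boosted by +6) than non-members (+2) -- an assumed, illustrative gap.
member_logits = rng.normal(0.0, 1.0, (100, 10))
member_logits[:, 0] += 6.0
nonmember_logits = rng.normal(0.0, 1.0, (100, 10))
nonmember_logits[:, 0] += 2.0

print("member hit rate:        ", infer_membership(member_logits).mean())
print("non-member flagged rate:", infer_membership(nonmember_logits).mean())
```

The attack succeeds to the extent that the flagged rate on members exceeds the flagged rate on non-members; real attacks refine this idea with shadow models and per-sample calibration.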
ENTITY
1 - Human
INTENT
1 - Intentional
TIMING
2 - Post-deployment
Risk ID
mit1292
Domain lineage
2. Privacy & Security
2.2 > AI system security vulnerabilities and attacks
Mitigation strategy
1. Implement Differential Privacy (DP) to provide a formal, mathematically rigorous defense. DP guarantees that the influence of any single training record on the model's output distribution is bounded, providing provable protection against MIAs, although it may incur a trade-off with model utility.
2. Employ structural ensemble architectures with adaptive inference, such as the Split-AI component of the SELENA framework. This involves training multiple sub-models on random subsets of the training data and using an adaptive test-time strategy that aggregates outputs only from sub-models that did not use the queried sample in their training set, structurally enforcing similar model behavior on member and non-member inputs.
3. Apply privacy-preserving data preprocessing or data synthesis techniques. This involves transforming the sensitive training data before model learning, for example with generative models (such as Diffusion-Driven Data Preprocessing, D3P) or rigorous pseudonymization/anonymization, to alter the fine-grained statistical characteristics of the input and reduce exploitable membership signals.