Membership Inference Attack
A membership inference attack tries to determine whether a specific record was in a model's training data by probing how the model responds. It's a privacy leak: confirming someone's data was used can itself expose sensitive information.
Also known as: membership inference
Models often behave a little differently on data they were trained on than on data they've never seen: they tend to be more confident on training examples, or to fit them more tightly (lower loss). A membership inference attack exploits that gap. By querying the model and studying its responses, an attacker infers whether a particular record was part of the training set.
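As a minimal sketch of the idea above, the simplest form of the attack is a confidence threshold: guess "member" whenever the model is very confident in a record's true label. The function names, the confidence values, and the threshold below are all hypothetical, made up for illustration. A real attacker would obtain confidences by querying the target model and would usually calibrate the threshold on data whose membership is known.

```python
from typing import Sequence


def infer_membership(confidence_on_true_label: float, threshold: float) -> bool:
    """Guess 'member' when the model's confidence in the true label meets the threshold."""
    return confidence_on_true_label >= threshold


def attack_accuracy(
    member_conf: Sequence[float],
    nonmember_conf: Sequence[float],
    threshold: float,
) -> float:
    """Fraction of records whose membership the threshold rule guesses correctly."""
    correct = sum(infer_membership(c, threshold) for c in member_conf)
    correct += sum(not infer_membership(c, threshold) for c in nonmember_conf)
    return correct / (len(member_conf) + len(nonmember_conf))


# Hypothetical model confidences (illustrative only, not measured from any model).
member_conf = [0.99, 0.97, 0.95, 0.80]     # records that were in the training set
nonmember_conf = [0.90, 0.70, 0.60, 0.55]  # records the model never saw
threshold = 0.92                           # assumed cut-off

print(attack_accuracy(member_conf, nonmember_conf, threshold))  # 0.875
```

An accuracy near 0.5 on a balanced set like this would mean the attacker is doing no better than guessing. The further above 0.5 it climbs, the more the model's behavior gives membership away.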
That sounds abstract until the data is sensitive. Confirming that a specific person's medical record, financial history, or private message was in a training set can be a serious privacy breach on its own, regardless of what the model outputs. This risk is a leading reason regulated teams care about how models are trained and what they memorize. Defenses center on limiting memorization, using techniques like differential privacy, careful handling of duplicates, and avoiding overfitting, so that the model's behavior on training data is hard to distinguish from its behavior on anything else.
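To make the differential privacy defense more concrete, here is a rough sketch of the core step used in DP-SGD-style training: clip each example's gradient to a maximum L2 norm, sum the clipped gradients, add Gaussian noise scaled to that norm, and average. The names `clip_l2`, `private_mean_gradient`, `max_norm`, and `noise_multiplier` are assumptions chosen for this illustration, and the gradients are made-up numbers. The sketch does not compute the resulting privacy guarantee, which requires a separate accounting step.

```python
import math
import random
from typing import List, Sequence


def clip_l2(grad: Sequence[float], max_norm: float) -> List[float]:
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(
    per_example_grads: Sequence[Sequence[float]],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_l2(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is tied to the clipping norm, so no single
    # example can move the result by much more than the noise level.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical per-example gradients (illustrative only).
grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)  # seeded so the sketch is repeatable
print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any one record can influence an update, and the noise masks what influence remains. Together they narrow the behavioral gap that a membership inference attack relies on, usually at some cost in accuracy.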