What is an Evasion Attack?

Evasion Attack

An evasion attack is an input crafted to slip past a model's classifier or safety check at inference time: a spam message tweaked to read as legitimate, or a malicious payload perturbed to look benign. The model itself isn't compromised; it's fooled by an input built to exploit its blind spots.

Also known as: adversarial evasion

Jun 17, 2026 · Chain of Thought

Evasion attacks target the model at inference time rather than corrupting its training. The attacker takes something the model should flag (spam, fraud, malware, disallowed content) and perturbs it just enough that the model misclassifies it as safe, while a human would still recognize it for what it is. Classic examples are adversarially modified images that fool a vision classifier and the constant cat-and-mouse game between attackers and spam and abuse filters.
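
The perturbation step can be made concrete with a toy example. The sketch below is an illustration only: the three-feature logistic detector, its weights, and the `evade` helper are all hypothetical, and the perturbation borrows the idea behind the fast gradient sign method (step each feature against the sign of the gradient) rather than reproducing any particular published attack.

```python
import math

# Hypothetical toy detector: logistic regression over three numeric features.
# The weights and bias are made up for illustration, not taken from a real model.
WEIGHTS = [2.0, 1.5, -1.0]
BIAS = -1.0


def score(x):
    """Return the detector's estimated probability that x is malicious."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def is_flagged(x, threshold=0.5):
    """The detector flags an input when its score reaches the threshold."""
    return score(x) >= threshold


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def evade(x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the logit with respect to x is simply
    WEIGHTS, so stepping against sign(w) is the gradient-sign move.
    """
    return [xi - epsilon * sign(w) for xi, w in zip(x, WEIGHTS)]


original = [1.0, 1.0, 0.5]                 # an input the detector catches
perturbed = evade(original, epsilon=0.6)   # a small, bounded change per feature

print(is_flagged(original), round(score(original), 2))
print(is_flagged(perturbed), round(score(perturbed), 2))
```

The model's parameters never change here; only the input moves, and it moves just far enough to cross the decision boundary. A real attacker rarely has the weights in hand and would typically estimate the direction by querying the model, which this sketch does not attempt.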

Evasion attacks work because models learn decision boundaries with exploitable gaps, and attackers probe for inputs that land on the wrong side. Defenses are layered: adversarial training (showing the model evasive examples during training), input validation, ensembles, and never relying on a single model as the only gate. For anything security-sensitive, assume a motivated attacker will probe the boundary, and design the system so that a single fooled model isn't the whole defense.
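
As a rough sketch of what stacking defenses can look like, the example below combines two of the ideas above: input validation (normalizing text before it is scored) and not relying on a single gate (several independent detectors, any of which can block). Every name here is hypothetical, the detectors are deliberately simplistic keyword and pattern checks rather than trained models, and adversarial training is not shown.

```python
import unicodedata

# Hypothetical look-alike substitutions an attacker might use to dodge keyword matching.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text):
    """Input validation step: strip accents, fold case, undo common look-alikes."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.lower().translate(LOOKALIKES)


def keyword_detector(text):
    """Fires on a small, made-up list of spam phrases."""
    return any(phrase in text for phrase in ("free money", "wire transfer"))


def link_detector(text):
    """Fires when the message contains a web link."""
    return "http://" in text or "https://" in text


def urgency_detector(text):
    """Fires on heavy use of exclamation marks."""
    return text.count("!") >= 3


DETECTORS = [keyword_detector, link_detector, urgency_detector]


def layered_gate(raw_text):
    """Block the message if any detector fires on the normalized input."""
    cleaned = normalize(raw_text)
    return any(detector(cleaned) for detector in DETECTORS)


evasive = "Fr3e m0ney, act now"
print(keyword_detector(evasive))  # the lone detector on raw input
print(layered_gate(evasive))      # the same message through the layered gate
```

The evasive message slips past the keyword check when it is applied to the raw text, but the normalization step undoes the character swaps before any detector runs. Blocking when any detector fires trades more false positives for fewer misses; a real system would tune that policy to its own tolerance for each.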

Related terms