https://arxiv.org/pdf/1706.06083.pdf

ok so the idea here is we want to apply a security mindset to safety against adversarial attacks:

providing guarantees for how safe a model is, against what attacks, and against what strength of those attacks.

specifying an attack model:

this is the general formalization of what it means, in this setup, to attack a model. the adversary wants to find a perturbation δ in some allowed set of perturbations S such that the loss of the model is maximized (here the model has parameters θ).
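
written out, this is the saddle-point problem the paper builds everything around: the inner max is the attacker's problem (find the worst perturbation δ in the allowed set S), the outer min is the defender's training problem.

$$\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\delta\in\mathcal{S}} L(\theta,\, x+\delta,\, y)\Big]$$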

this doesn’t really specify anything, but this gives us levers to change in defining our attacks. a popular choice for S is the ℓ∞ ε-ball around x. (cc: ℓ_p metrics)
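
a minimal sketch (my names/notation, not the paper's code) of what staying inside the ℓ∞ ε-ball around x looks like in PyTorch, assuming inputs are images scaled to [0, 1]:

```python
import torch

def project_linf(x_adv: torch.Tensor, x: torch.Tensor, eps: float) -> torch.Tensor:
    """Push a candidate adversarial example back into the l-infinity
    eps-ball around the clean input x, and into valid pixel range."""
    # each coordinate may differ from the clean input by at most eps
    x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
    # keep the result a valid image
    return torch.clamp(x_adv, 0.0, 1.0)
```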

FGSM is a way to actually get (an approximation of) such a maximizing perturbation: take a single step of size ε in the direction of the sign of the gradient of the loss with respect to the input.
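
a minimal sketch of FGSM in PyTorch, assuming a classifier `model`, cross-entropy loss, and inputs in [0, 1] (the function name and signature here are mine):

```python
import torch
import torch.nn.functional as F

def fgsm(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor, eps: float) -> torch.Tensor:
    """One-step attack: move every input coordinate by eps in the
    direction that increases the loss (sign of the input gradient)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()               # single signed-gradient step
    return torch.clamp(x_adv, 0.0, 1.0).detach()  # stay in valid pixel range
```

the paper's PGD attack is basically this step taken several times with a smaller step size, projecting back onto the ε-ball (as in the sketch above) after each step.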