https://arxiv.org/pdf/1706.06083.pdf

ok so the idea here is we want to apply a security mindset to safety against adversarial attacks:

providing guarantees for how safe a model is, against what attacks, and against what strength of those attacks.

specifying an attack model:

this is the general formalization of what it means, in this setup, to attack a model. the adversary wants to find a perturbation δ in some allowed set of perturbations S such that the loss of the model is maximized (here the model has parameters θ).
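
written out, this is the saddle-point problem the paper builds everything around: the inner max is the attacker's problem (find the worst perturbation δ in the allowed set S), the outer min is the defender's training problem.

$$\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\delta\in\mathcal{S}} L(\theta,\, x+\delta,\, y)\Big]$$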

this doesn’t really specify anything, but this gives us levers to change in defining our attacks. a popular choice for S is the ℓ∞ ε-ball around x. (cc: ℓ_p metrics)
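
a minimal sketch (my names/notation, not the paper's code) of what staying inside the ℓ∞ ε-ball around x looks like in PyTorch, assuming inputs are images scaled to [0, 1]:

```python
import torch

def project_linf(x_adv: torch.Tensor, x: torch.Tensor, eps: float) -> torch.Tensor:
    """Push a candidate adversarial example back into the l-infinity
    eps-ball around the clean input x, and into valid pixel range."""
    # each coordinate may differ from the clean input by at most eps
    x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)
    # keep the result a valid image
    return torch.clamp(x_adv, 0.0, 1.0)
```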

FGSM is a way to actually get (an approximation of) such a maximizing perturbation: take a single step of size ε in the direction of the sign of the gradient of the loss with respect to the input.
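
a minimal sketch of FGSM in PyTorch, assuming a classifier `model`, cross-entropy loss, and inputs in [0, 1] (the function name and signature here are mine):

```python
import torch
import torch.nn.functional as F

def fgsm(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor, eps: float) -> torch.Tensor:
    """One-step attack: move every input coordinate by eps in the
    direction that increases the loss (sign of the input gradient)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()               # single signed-gradient step
    return torch.clamp(x_adv, 0.0, 1.0).detach()  # stay in valid pixel range
```

the paper's PGD attack is basically this step taken several times with a smaller step size, projecting back onto the ε-ball (as in the sketch above) after each step.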