... | ... | @@ -3,14 +3,18 @@ |
|
|
* Adversarial attacks on explanation maps:
|
|
|
* they can be changed to an arbitrary target map by applying a visually hardly perceptible input perturbation
|
|
|
* The perturbation leaves the output of the network for that input (approximately) unchanged
|
|
|
* $`\rightarrow`$ explanation maps are not robustly interpretable
|
|
|
* $`\rightarrow`$ explanation maps are not robust
|
|
|
<img src="uploads/expl_manipulate_fig1.png" width="300">
|
|
|
* This phenomenon is related to the geometry of the network's output manifold
|
|
|
* We can derive a bound on the degree of possible manipulation. The bound is proportional to two differential geometric quantities:
|
|
|
* principal curvatures
|
|
|
* geodesic distance between original input and manipulated counterpart
|
|
|
* This insight can be used to limit the possible manipulations $`\rightarrow`$ enhances resilience of explanation methods
|
|
|
|
|
|
* inputs that are close to each other in L2 distance can have drastically different explanations, as the geodesic distance can be substantially greater than the L2 distance
|
|
|
* Using softplus with small $`\beta`$ makes explanations more robust against manipulation.
|
|
|
* large curvature of the NN's decision function is responsible for the vulnerability
|
|
|
* softplus leads to reduced maximal curvature (smoothing kinks) compared to ReLU
|
|
|
* can use softplus **only** for the explanation generation, leaving original network as is
|
|
|
* doing this is much faster than the SmoothGrad method
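
A minimal sketch of the smoothing effect noted above, looking only at the activation function itself (not a full network, and not the authors' code). It assumes the usual parametrization $`\text{softplus}_\beta(z) = \frac{1}{\beta}\log(1 + e^{\beta z})`$; the helper names `softplus_curvature`, `max_curvature` and `max_gap_to_relu` are made up for illustration. The second derivative of softplus peaks at $`\beta/4`$ (at $`z = 0`$), so a smaller $`\beta`$ means lower maximal curvature, at the price of a larger deviation from ReLU.

```python
import numpy as np

def sigmoid(z):
    # Stable logistic function via tanh
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def softplus(z, beta):
    # (1/beta) * log(1 + exp(beta * z)), computed stably
    return np.logaddexp(0.0, beta * z) / beta

def softplus_curvature(z, beta):
    # Second derivative of softplus: beta * s * (1 - s), s = sigmoid(beta * z)
    s = sigmoid(beta * z)
    return beta * s * (1.0 - s)

betas = (0.5, 1.0, 10.0)
z = np.linspace(-5.0, 5.0, 1001)  # grid that contains z = 0

# Largest second derivative on the grid, per beta (analytically beta / 4)
max_curvature = {b: float(softplus_curvature(z, b).max()) for b in betas}

# Largest deviation from ReLU on the grid, per beta (analytically log(2) / beta)
max_gap_to_relu = {
    b: float(np.abs(softplus(z, b) - np.maximum(z, 0.0)).max()) for b in betas
}

print(max_curvature)
print(max_gap_to_relu)
```

ReLU's second derivative is zero everywhere except at the kink, where it is undefined; softplus replaces that kink with a bounded curvature controlled by $`\beta`$.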
|
|
|
|
|
|
## Manipulation of explanations
|
|
|
* Explanation methods used: gradient-based and propagation-based
|
... | ... | @@ -27,12 +31,14 @@ For the manipulated image $`x_{adv} = x + \delta x`$: |
|
|
|
|
|
Loss function: optimize $`\mathcal{L} = \|h(x_{adv}) - h^t\|^2 + \gamma \|g(x_{adv}) - g(x)\|^2`$ w.r.t. $`x_{adv}`$, where $`\gamma \in \mathbb{R}_+`$ is a hyperparameter
|
|
|
|
|
|
* Requires computing the gradient $`\nabla h(x)`$ of the explanation w.r.t. the input (if the explanation map is a first-order gradient, one needs second-order gradients to optimize it)
|
|
|
* Requires computing the gradient of both the network output and the generated explanation map w.r.t. the input
|
|
|
* if the explanation map is based on a first-order gradient, one needs second-order gradients to optimize it
|
|
|
* ReLU has vanishing second derivative $`\rightarrow`$ replace ReLU with softplus
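
The loss above can be sketched on a toy model. Everything below is an illustrative assumption rather than the setup from the notes: a hypothetical one-hidden-layer network $`g(x) = w_2^\top \text{softplus}_\beta(W_1 x)`$ with random weights, the plain gradient $`h(x) = \nabla_x g(x)`$ as explanation map, and an all-zero target map $`h^t`$. Only the loss is evaluated; the optimization over $`x_{adv}`$ is left out.

```python
import numpy as np

def sigmoid(z):
    # Stable logistic function via tanh
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def softplus(z, beta):
    # (1/beta) * log(1 + exp(beta * z)), computed stably
    return np.logaddexp(0.0, beta * z) / beta

# Hypothetical toy network with random weights (assumption, not from the notes)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
w2 = rng.normal(size=4)
BETA = 5.0

def g(x):
    # Scalar network output g(x) = w2 . softplus(W1 x)
    return w2 @ softplus(W1 @ x, BETA)

def h(x):
    # Gradient explanation map: dg/dx = W1^T (w2 * sigmoid(beta * W1 x))
    return W1.T @ (w2 * sigmoid(BETA * (W1 @ x)))

def manipulation_loss(x_adv, x, h_target, gamma):
    # ||h(x_adv) - h^t||^2 + gamma * ||g(x_adv) - g(x)||^2
    explanation_term = np.sum((h(x_adv) - h_target) ** 2)
    output_term = np.sum((g(x_adv) - g(x)) ** 2)
    return float(explanation_term + gamma * output_term)

x = rng.normal(size=3)
h_target = np.zeros(3)  # arbitrary target map chosen for illustration

# At x_adv = x the output term vanishes, so only the explanation term remains
loss_at_x = manipulation_loss(x, x, h_target, gamma=10.0)
print(loss_at_x)
```

Because `h` already contains a first derivative of the network, minimizing this loss w.r.t. `x_adv` by gradient descent would differentiate `h` once more, which is where the second-order gradient in the list above comes from.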
|
|
|
|
|
|
### Experiments
|
|
|
* Qualitative Analysis: Target is closely emulated, perturbation is small.
|
|
|
* Quantitative Analysis: measure SSIM, PCC, and MSE between the target and the manipulated explanation map, as well as between the original image and the perturbed image.
|
|
|
* ReLU vs softplus (with different $`\beta`$)
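
Two of the metrics named above can be sketched in a few lines. This is a rough illustration with made-up helper names (`mse`, `pcc`) and random arrays standing in for explanation maps; SSIM is left out because it needs windowed local statistics.

```python
import numpy as np

def mse(a, b):
    # Mean squared error between two maps of equal shape
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def pcc(a, b):
    # Pearson correlation coefficient over the flattened maps
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

# Random stand-ins for a target map and a manipulated map (illustration only)
rng = np.random.default_rng(0)
target_map = rng.random((8, 8))
manipulated_map = target_map + 0.01 * rng.normal(size=(8, 8))

print(mse(target_map, manipulated_map))
print(pcc(target_map, manipulated_map))
```

A successful manipulation in the sense of the notes would show low MSE and high PCC between target and manipulated explanation maps, and likewise between the original and the perturbed image.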
|
|
|
|
|
|
|
|
|
<img src="uploads/expl_manipulated_fig2.png" width="800"> |