

* Adversarial attacks on explanation maps:



  * they can be changed to an arbitrary target map by applying a visually hardly perceptible perturbation to the input



  * the perturbation does not change the output of the network for that input



  * $`\rightarrow`$ explanation maps are not robust and hence not reliably interpretable



<img src="uploads/expl_manipulate_fig1.png" width="300">



* This phenomenon is related to the geometry of the network's output manifold



* We can derive a bound on the degree of possible manipulation; it is proportional to two differential-geometric quantities (a schematic form is sketched after this list):



  * principal curvatures



  * geodesic distance between the original input and its manipulated counterpart



* Using this insight to limit the possible manipulations $`\rightarrow`$ enhances the resilience of explanation methods
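One schematic way to write the bound described above (illustrative notation, not the paper's exact theorem statement): with $`\kappa_{\max}`$ the largest principal curvature and $`d_g`$ the geodesic distance, the achievable change of the explanation map $`h`$ is roughly controlled by

```math
\| h(x_{adv}) - h(x) \| \;\lesssim\; \kappa_{\max} \, d_g(x, x_{adv})
```

i.e. in flat regions (small principal curvatures), and for inputs that are geodesically close, only small changes of the explanation are possible.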






* inputs that are similar to each other in the $`L_2`$ metric can have drastically different explanations, since the geodesic distance between them can be substantially larger than their $`L_2`$ distance



* Using softplus with small $`\beta`$ makes explanations more robust against manipulation.



* the large curvature of the NN's decision function is responsible for the vulnerability



* softplus leads to a reduced maximal curvature compared to ReLU (it smooths out the kinks)



* can use softplus **only** for the explanation generation, leaving the original network as is (see the sketch after this list)



* doing this is much faster than the SmoothGrad method
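A minimal PyTorch sketch of the softplus-for-explanation idea (the helper names, the module-swapping approach, and the $`\beta`$ default are illustrative assumptions, not the paper's reference code): the original ReLU network is left untouched and only a smoothed copy is used for the explanation pass.

```python
import copy
import torch
import torch.nn as nn

def swap_relu_for_softplus(module: nn.Module, beta: float) -> None:
    """Recursively replace every nn.ReLU submodule by nn.Softplus(beta).

    Note: activations applied functionally (F.relu) are not caught by this swap.
    """
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.Softplus(beta=beta))
        else:
            swap_relu_for_softplus(child, beta)

def beta_smoothed_explanation(model: nn.Module, x: torch.Tensor,
                              target_class: int, beta: float = 0.8) -> torch.Tensor:
    """Gradient explanation computed on a softplus-smoothed *copy* of the model."""
    smoothed = copy.deepcopy(model)
    swap_relu_for_softplus(smoothed, beta)
    smoothed.eval()

    x = x.clone().requires_grad_(True)
    score = smoothed(x)[0, target_class]   # logit of the class being explained
    grad, = torch.autograd.grad(score, x)
    return grad.abs()                      # saliency-style explanation map
```

In contrast to SmoothGrad, which averages gradients over many noisy copies of the input, this needs only a single forward/backward pass, which is where the speed-up comes from.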






## Manipulation of explanations



* Used explanation methods: gradient-based and propagation-based

For the manipulated image $`x_{adv} = x + \delta x`$:





Loss function: optimize $`\mathcal{L} = \|h(x_{adv}) - h^t\|^2 + \gamma \|g(x_{adv}) - g(x)\|^2`$ w.r.t. $`x_{adv}`$, where $`\gamma \in \mathbb{R}_+`$ is a hyperparameter






* Requires computing the gradient of both the network output and the generated explanation map w.r.t. the input



* if the explanation map is itself a first-order gradient, one needs second-order gradients to optimize the loss



* ReLU has a vanishing second derivative $`\rightarrow`$ replace ReLU with softplus for this optimization (see the sketch below)
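A hedged PyTorch sketch of the manipulation loss and its optimization (optimizer choice, hyperparameter values, and helper names are illustrative assumptions, not the paper's reference implementation; the model is assumed to already use softplus activations so that the required second-order gradients do not vanish):

```python
import torch

def gradient_explanation(model, x, target_class):
    """h(x): gradient of the target-class score w.r.t. the input.
    create_graph=True keeps the graph so the loss can be differentiated through h."""
    score = model(x)[0, target_class]
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad

def manipulate_explanation(model, x, h_target, target_class,
                           gamma=1e6, lr=1e-3, steps=500):
    """Optimize x_adv so that its explanation matches h_target while the
    network output stays close to the output for the original input x."""
    for p in model.parameters():               # only the input is optimized, not the weights
        p.requires_grad_(False)

    x = x.detach()
    with torch.no_grad():
        g_x = model(x)                          # original output g(x), kept fixed

    x_adv = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_adv], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        h_adv = gradient_explanation(model, x_adv, target_class)
        g_adv = model(x_adv)
        # L = ||h(x_adv) - h^t||^2 + gamma * ||g(x_adv) - g(x)||^2
        loss = ((h_adv - h_target) ** 2).sum() + gamma * ((g_adv - g_x) ** 2).sum()
        loss.backward()                         # second-order gradient through h
        opt.step()
    return x_adv.detach()
```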






### Experiments



* Qualitative Analysis: the target map is closely emulated, while the perturbation of the input stays small.



* Quantitative Analysis: measure SSIM, PCC, and MSE between the target and the manipulated explanation map, as well as between the original and the perturbed image (a sketch of these metrics follows below).



* ReLU vs softplus (with different $`\beta`$)
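A minimal sketch of how these similarity metrics could be computed with common libraries (the function name and the shared `data_range` convention are assumptions, not the paper's evaluation code):

```python
import numpy as np
from scipy.stats import pearsonr
from skimage.metrics import structural_similarity

def explanation_similarity(a: np.ndarray, b: np.ndarray) -> dict:
    """SSIM, Pearson correlation coefficient (PCC) and MSE between two 2-D maps."""
    data_range = float(max(a.max(), b.max()) - min(a.min(), b.min()))
    return {
        "ssim": structural_similarity(a, b, data_range=data_range),
        "pcc": pearsonr(a.ravel(), b.ravel())[0],
        "mse": float(np.mean((a - b) ** 2)),
    }
```

Applied once to the pair (target explanation, manipulated explanation) and once to the pair (original image, perturbed image), as described above.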









<img src="uploads/expl_manipulated_fig2.png" width="800"> 

