# Explanations can be manipulated and geometry is to blame

**Keys:**

* Adversarial attacks on explanation maps:
  * Explanation maps can be changed to an arbitrary target map by applying a visually almost imperceptible input perturbation.
  * The perturbation does not change the output of the network for that input.
  * $`\rightarrow`$ Explanation maps are not robustly interpretable.

<img src="uploads/expl_manipulate_fig1.png" width="300">

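The attack described above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the authors' implementation: the explanation is taken to be the plain gradient map, the network is a tiny one-hidden-layer softplus model, and the loss (squared distance to the target map plus a penalty `gamma` on any change of the output) is one plausible way to encode the two bullets. All names (`forward`, `attack_loss_and_grad`, `manipulate`, `gamma`, `eps`) are hypothetical.

```python
import math
import random

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def forward(W1, w2, x):
    """Return pre-activations z, scalar output g(x), and gradient map h(x) = dg/dx."""
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    g = sum(w * softplus(zj) for w, zj in zip(w2, z))
    h = [sum(w2[j] * sigmoid(z[j]) * W1[j][i] for j in range(len(W1)))
         for i in range(len(x))]
    return z, g, h

def attack_loss_and_grad(W1, w2, x_adv, target, g_orig, gamma):
    """Assumed loss: ||h(x_adv) - target||^2 + gamma * (g(x_adv) - g_orig)^2."""
    z, g, h = forward(W1, w2, x_adv)
    r = [hi - ti for hi, ti in zip(h, target)]
    loss = sum(ri * ri for ri in r) + gamma * (g - g_orig) ** 2
    # Hessian-vector product H r, with H = W1^T diag(w2 * sigmoid'(z)) W1.
    coeff = [w2[j] * sigmoid(z[j]) * (1.0 - sigmoid(z[j]))
             * sum(W1[j][i] * r[i] for i in range(len(r)))
             for j in range(len(W1))]
    grad = [2.0 * sum(coeff[j] * W1[j][k] for j in range(len(W1)))
            + 2.0 * gamma * (g - g_orig) * h[k]
            for k in range(len(x_adv))]
    return loss, grad

def manipulate(W1, w2, x, target, gamma=10.0, eps=0.1, steps=200):
    """Projected gradient descent with backtracking; |x_adv - x| <= eps per coordinate."""
    _, g_orig, _ = forward(W1, w2, x)
    x_adv = list(x)
    loss, grad = attack_loss_and_grad(W1, w2, x_adv, target, g_orig, gamma)
    for _ in range(steps):
        step, improved = 1.0, False
        while step > 1e-8 and not improved:
            # Gradient step, then clip back into the eps-box around x.
            cand = [min(max(xa - step * gk, xi - eps), xi + eps)
                    for xa, gk, xi in zip(x_adv, grad, x)]
            cand_loss, cand_grad = attack_loss_and_grad(
                W1, w2, cand, target, g_orig, gamma)
            if cand_loss < loss:  # accept only steps that lower the loss
                x_adv, loss, grad, improved = cand, cand_loss, cand_grad, True
            step *= 0.5
        if not improved:
            break
    return x_adv

# Toy usage with random weights (illustrative values only).
random.seed(0)
W1 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
w2 = [random.gauss(0, 1) for _ in range(8)]
x = [0.5, -0.3, 0.8, 0.1]
target = [1.0, 0.0, 0.0, 0.0]  # arbitrary target explanation map
x_adv = manipulate(W1, w2, x, target)
_, g_before, h_before = forward(W1, w2, x)
_, g_after, h_after = forward(W1, w2, x_adv)
print("output change:", abs(g_after - g_before))
print("explanation before:", h_before)
print("explanation after: ", h_after)
```

Softplus is used instead of ReLU here because a piecewise-linear network has zero second derivative almost everywhere, so the gradient of a gradient map would give the optimizer nothing to follow. How far the map can actually be moved within the `eps`-box depends on the curvature of the toy model.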
* This phenomenon is related to the geometry of the network's output manifold.
* We can derive a bound on the degree of possible manipulation. The bound is proportional to two differential geometric quantities:
  * principal curvatures
... | | ... | |