## Must-read papers on Interpretability and Explanations.
We must make a distinction between interpretable models and interpreting decisions made by models.

We release [InterpretMe]

### Survey papers:

1. **Jaspreet's Master Piece**
*Jaspreet Singh* 2019. [paper](https://arxiv.org/pdf/xxx.pdf)

1. **Interpretability of Machine Learning Models and Representations: an Introduction**
*Adrien Bibal and Benoît Frénay* 2018. [paper](https://pdfs.semanticscholar.org/4646/56fc6431f1db8b2e0b0b3093a5df1cb7958e.pdf)

1. **A Survey Of Methods For Explaining Black Box Models**
*Riccardo Guidotti, Anna Monreale, Franco Turini, Dino Pedreschi, Fosca Giannotti*. 2018. [paper](https://arxiv.org/pdf/1802.01933.pdf)

### Theses:

1. **Learning Interpretable Models**
*Stefan Rüping*. 2006. [paper](https://eldorado.tu-dortmund.de/bitstream/2003/23008/1/dissertation_rueping.pdf)

2. **Explaining Rankings**
*Maartje Anne ter Hoeve*. 2017. [thesis](https://pdfs.semanticscholar.org/756e/28e7fa971b2c610605ee4223ec18544aa7cf.pdf)
### Journal and Conference papers:

1. **Towards a rigorous science of interpretable machine learning.**
*Finale Doshi-Velez and Been Kim*. 2017. [paper](https://arxiv.org/pdf/1702.08608.pdf)

1. **Streaming weak submodularity: Interpreting neural networks on the fly.**
*Ethan R Elenberg, Alexandros G Dimakis, Moran Feldman, and Amin Karbasi*. 2017. [paper](https://arxiv.org/pdf/1703.02647)

1. **Interpretable explanations of black boxes by meaningful perturbation.**
*Ruth C Fong and Andrea Vedaldi*. ICCV 2017. [paper](https://arxiv.org/pdf/1704.03296.pdf)

1. **Supervised topic models for clinical interpretability.**
*Michael C Hughes, Huseyin Melih Elibol, Thomas McCoy, Roy Perlis, and Finale Doshi-Velez*. 2016. [paper](https://arxiv.org/pdf/1612.01678)

1. **A unified approach to interpreting model predictions.**
*Scott Lundberg and Su-In Lee*. 2017. [paper](https://arxiv.org/pdf/1705.07874)
 
1. **A human-grounded evaluation benchmark for local explanations of machine learning.**
*Sina Mohseni and Eric D Ragan*. 2018. [paper](https://arxiv.org/pdf/1801.05075)

1. **Anchors: High-precision model-agnostic explanations.**
*Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin*. AAAI 2018. [paper](https://homes.cs.washington.edu/~marcotcr/aaai18.pdf)

1. **Right for the right reasons: Training differentiable models by constraining their explanations.**
*Andrew Slavin Ross, Michael C. Hughes, and Finale Doshi-Velez*. IJCAI 2017. [paper](https://doi.org/10.24963/ijcai.2017/371)

1. **Sharing Deep Neural Network Models with Interpretation.**
*Huijun Wu, Chen Wang, Jie Yin, Kai Lu and Liming Zhu*. WWW’18.  [paper](https://doi.org/10.24963/ijcai.2017/371)

1. **TEM: Tree-enhanced Embedding Model for Explainable Recommendation.**
*Xiang Wang, Xiangnan He, Fuli Feng, Liqiang Nie and Tat-Seng Chua*. WWW’18. [paper](https://www.comp.nus.edu.sg/~xiangnan/papers/www18-tem.pdf)

1. **Towards Deep Interpretability (MUS-ROVER II): Learning Hierarchical Representations of Tonal Music.** 
*Haizi Yu, Lav R. Varshney*. ICLR’17. [paper](https://openreview.net/pdf?id=ryhqQFKgl)

1. **Generating Interpretable Images with Controllable Structure**
*Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Victor Bapst, Matt Botvinick, Nando de Freitas*. ICLR’17. [paper](http://www.scottreed.info/files/iclr2017.pdf)

1. **Supervised topic models for clinical interpretability.**
*Hughes et al.* 2016. [paper](https://arxiv.org/pdf/1612.01678.pdf)

1. **An Effective and Interpretable Method for Document Classification**
*Ngo Van Linh, Nguyen Kim Anh, Khoat Than, Chien Nguyen Dang*. KAIS 2016. [paper](http://is.hust.edu.vn/~khoattq/papers/kais-2016.pdf)

1. **Interpretable probabilistic embeddings: bridging the gap between topic models and neural networks.**
*Anna Potapenko, Artem Popov, and Konstantin Vorontsov*. 2017. [paper](https://arxiv.org/pdf/1711.04154.pdf)

1. **Interpretable Explanations of Black Boxes by Meaningful Perturbation.** 
*Ruth C Fong and Andrea Vedaldi*. ICCV 2017. [paper](http://openaccess.thecvf.com/content_ICCV_2017/papers/Fong_Interpretable_Explanations_of_ICCV_2017_paper.pdf)

1. **Interpretable Convolutional Neural Networks with Dual Local and Global Attention for Review Rating Prediction.**
*Sungyong Seo, Jing Huang, Hao Yang, and Yan Liu*. RecSys 2017. [paper](https://dl.acm.org/citation.cfm?id=3109890)

1. **Explicit factor models for explainable recommendation based on phrase-level sentiment analysis.**
*Yongfeng Zhang, Guokun Lai, Min Zhang, Yi Zhang, Yiqun Liu, and Shaoping Ma*. SIGIR 2014. [paper](http://yongfeng.me/attach/efm-slice-zhang.pdf)

1. **What your images reveal: Exploiting visual contents for point-of-interest recommendation.**
*Suhang Wang, Yilin Wang, Jiliang Tang, Kai Shu, Suhas Ranganath, and Huan Liu*. WWW 2017. [paper](http://www.public.asu.edu/~swang187/publications/VPOI.pdf)

1. **A causal framework for explaining the predictions of black-box sequence-to-sequence models**
*David Alvarez-Melis, Tommi S. Jaakkola*. EMNLP 2017. [paper](http://www.aclweb.org/anthology/D17-1042)

1. **"Why Should I Trust You?": Explaining the Predictions of Any Classifier.**
*Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin*. SIGKDD 2016. [paper](https://chara.cs.illinois.edu/sites/fa16-cs591txt/pdf/Ribeiro-2016-KDD.pdf)

1. **Understanding Black-box Predictions via Influence Functions**
*Pang Wei Koh and Percy Liang*. ICML 2017. [paper](https://arxiv.org/pdf/1703.04730.pdf)

1. **Detecting and Correcting for Label Shift with Black Box Predictors**
*Zachary C. Lipton, Yu-Xiang Wang, Alex Smola*. ICML 2018. [paper](https://arxiv.org/pdf/1802.03916.pdf)

1. **Visually Explainable Recommendation.**
*Chen et al.* 2018. [paper](https://arxiv.org/pdf/1801.10288.pdf)

1. **How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation**
*Menaka Narayanan, Emily Chen, Jeffrey He, Been Kim, Sam Gershman, Finale Doshi-Velez*. 2018. [paper](https://arxiv.org/abs/1802.00682)

1. **Hierarchical Attention Networks for Document Classification**
*Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, Eduard Hovy*. NAACL 2016. [paper](http://www.aclweb.org/anthology/N16-1174) [code](https://github.com/richliao/textClassifier)



## Relevance propagation 

1. **On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation**
*S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek*. PLOS ONE, 2015. [paper](http://iphome.hhi.de/samek/pdf/BacPLOS15.pdf)

1. **Explaining nonlinear classification decisions with deep taylor decomposition**
*G Montavon, S Lapuschkin, A Binder, W Samek, KR Müller*. Pattern Recognition, 2017. [paper](http://iphome.hhi.de/samek/pdf/MonPR17.pdf) [code](https://github.com/sebastian-lapuschkin/lrp_toolbox) [tutorial](http://www.heatmapping.org/tutorial/)

1. **Exploring text datasets by visualizing relevant words**
*F Horn, L Arras, G Montavon, KR Müller, W Samek*. 2017. [paper](https://arxiv.org/pdf/1707.05261.pdf)

1. **Methods for Interpreting and Understanding Deep Neural Networks**
*G Montavon, W Samek, KR Müller*. Digital Signal Processing, 73:1-15, 2018. [paper](https://arxiv.org/abs/1706.07979)

1. **Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models**
*W Samek, T Wiegand, KR Müller*. ITU Journal: ICT Discoveries - Special Issue 1 2017. [paper](https://arxiv.org/abs/1708.08296)

1. **"What is Relevant in a Text Document?": An Interpretable Machine Learning Approach**
*L Arras, F Horn, G Montavon, KR Müller, W Samek*. PLOS ONE, 2017. [paper](https://arxiv.org/abs/1612.07843)

1. **Explaining NonLinear Classification Decisions with Deep Taylor Decomposition**
*G Montavon, S Lapuschkin, A Binder, W Samek, KR Müller*. Pattern Recognition, 2017. [paper](http://arxiv.org/abs/1512.02479)