Research article (free access)
- Authors:
- Aya Abdelsalam Ismail, Department of Computer Science, University of Maryland; Data Science and Statistical Computing, Genentech, Inc.
- Soheil Feizi, Department of Computer Science, University of Maryland; Data Science and Statistical Computing, Genentech, Inc.
- Héctor Corrada Bravo, Department of Computer Science, University of Maryland; Data Science and Statistical Computing, Genentech, Inc.
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, December 2021, Article No. 2047, Pages 26726–26739
Published: 10 June 2024
Improving deep learning interpretability by saliency guided training
ABSTRACT
Saliency methods are widely used to highlight important input features in model predictions. Most existing methods generate saliency maps by backpropagating through a modified gradient function, so noisy gradients can yield unfaithful feature attributions. In this paper, we tackle this issue and introduce a saliency guided training procedure for neural networks that reduces noisy gradients used in predictions while retaining the model's predictive performance. Our procedure iteratively masks features with small, potentially noisy gradients while maximizing the similarity of model outputs for masked and unmasked inputs. We apply saliency guided training to various synthetic and real datasets from computer vision, natural language processing, and time series, across diverse neural architectures including Recurrent Neural Networks, Convolutional Networks, and Transformers. Through qualitative and quantitative evaluations, we show that saliency guided training significantly improves model interpretability across these domains while preserving predictive performance.
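The two core operations the abstract describes, masking the features with the smallest (potentially noisy) gradients and comparing model outputs on masked versus unmasked inputs, can be sketched in plain Python. This is a minimal, framework-free illustration, not the authors' implementation: the function names, the zero baseline used for masking, and the use of KL divergence as the output-similarity term are assumptions for the sketch.

```python
import math

def mask_low_gradient_features(x, grads, k, baseline=0.0):
    """Replace the k features of x with the smallest |gradient| by a baseline value."""
    order = sorted(range(len(x)), key=lambda i: abs(grads[i]))  # ascending by |grad|
    masked = list(x)
    for i in order[:k]:
        masked[i] = baseline
    return masked

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete model output distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# One saliency guided training step would then combine these, conceptually:
#   1. forward/backward pass -> per-feature input gradients
#   2. x_masked = mask_low_gradient_features(x, grads, k)
#   3. loss = cross_entropy(f(x), y) + lam * kl_divergence(f(x), f(x_masked))
# Minimizing the KL term pushes the model to ignore the low-gradient
# (masked) features, which reduces gradient noise in the saliency maps.
```

In a real training loop the per-feature gradients in step 1 would come from the framework's autodiff (e.g. a backward pass with respect to the inputs), and `lam` would be a tuned hyperparameter.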
Supplemental Material
Available for download: 3540261.3542308_supp.pdf (15.3 MB)
Published in
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems
December 2021, 30517 pages
ISBN: 9781713845393
Editors: M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, J. Wortman Vaughan
Copyright © 2021 Neural Information Processing Systems Foundation, Inc.
Publisher: Curran Associates Inc., Red Hook, NY, United States
Qualifiers
- Research article
- Refereed limited