Explaining the black-box model: A survey of local interpretation methods for deep neural networks

Yu Liang, Siguang Li, Chungang Yan, Maozhen Li, Changjun Jiang
DOI: https://doi.org/10.1016/j.neucom.2020.08.011
IF: 6
2021-01-01
Neurocomputing
Abstract: Recently, a significant amount of research has investigated the interpretation of deep neural networks (DNNs), which are normally treated as black-box models. Among the methods that have been developed, local interpretation methods stand out for their clear expression of interpretations and low computational complexity. Unlike existing surveys, which cover a broad range of DNN interpretation methods, this survey focuses on local interpretation methods, with an in-depth analysis of representative works, including newly proposed approaches. From the perspective of principles, we first divide local interpretation methods into two main categories: model-driven methods and data-driven methods. We then draw fine-grained distinctions within these two categories and highlight the latest ideas and principles. We further demonstrate the effects of a number of interpretation methods by reproducing their results with open-source software plugins. Finally, we point out research directions in this rapidly evolving field.