A Way to Understand the Features of Deep Neural Networks by Network Inversion

Hong Wang, Xianzhong Chen, Jiangyun Li
DOI: https://doi.org/10.1007/978-981-15-1922-2_20
2019-01-01
Abstract: New variants of Deep Neural Networks (DNNs) have been proposed continuously in recent years and have achieved impressive performance in a wide range of fields, such as computer vision, natural language processing, and recommender systems. However, DNNs are often criticized for their lack of interpretability. This paper proposes a network inversion method for understanding the features extracted by DNNs. A DNN can be regarded as a kind of transformation, and studying the characteristics and features of a transformation often involves its inverse. Comparing the inverted signal with the original one gives a better understanding of the features and properties of the transformation. The paper finds that the features extracted by a dimension-reduction layer in a DNN are essentially the special (particular) solution of the layer’s constraint equations, while the linear combination of the general solutions is neglected by the layer. This finding should help in understanding the structure and function of a DNN, and the experiments in the paper demonstrate its importance.
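The claim about dimension-reduction layers can be illustrated with a small numerical sketch. This is not the authors' inversion method. It assumes a single linear layer with no bias and no activation, and the weight matrix `W`, the input `x`, and the sizes 5 and 3 are hypothetical values chosen only for illustration. Under those assumptions, inverting the layer with the Moore-Penrose pseudoinverse recovers only the special solution of `W x = y`, and the remainder of the input lies in the null space of `W`, so the layer cannot see it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimension-reduction layer: 5 inputs -> 3 features.
# Assumption: purely linear, no bias and no activation.
W = rng.standard_normal((3, 5))
x = rng.standard_normal(5)

# Features extracted by the layer (the constraint equations are W x = y).
y = W @ x

# Invert the layer with the Moore-Penrose pseudoinverse.
# This yields the minimum-norm solution of W x = y.
x_particular = np.linalg.pinv(W) @ y

# What the inversion cannot recover: the component of x in the null space of W.
x_null = x - x_particular

print("inverted signal reproduces the features:", np.allclose(W @ x_particular, y))
print("lost component is invisible to the layer:", np.allclose(W @ x_null, 0.0))
print("the two components are orthogonal:", np.isclose(x_particular @ x_null, 0.0))
```

Comparing `x_particular` with `x` is the kind of comparison between the inverted signal and the original that the abstract describes: the difference `x_null` is exactly the part of the input that this simplified layer discards.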