Interpret Neural Networks by Extracting Critical Subnetworks

Yulong Wang, Hang Su, Bo Zhang, Xiaolin Hu
DOI: https://doi.org/10.1109/tip.2020.2993098
IF: 10.6
2020-01-01
IEEE Transactions on Image Processing
Abstract: In recent years, deep neural networks have achieved excellent performance in many fields of artificial intelligence, and the requirements for their interpretability and robustness have increased accordingly. In this paper, we propose to understand the functional mechanism of neural networks by extracting critical subnetworks. Specifically, we define a critical subnetwork as a group of important channels across layers such that, if they were suppressed to zero, the final test performance would deteriorate severely. This novel perspective can not only reveal the layerwise semantic behavior within the model but also produce more accurate visual explanations of the data through attribution methods. Moreover, we propose two adversarial example detection methods based on the properties of sample-specific and class-specific subnetworks, which opens the possibility of increasing model robustness.
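The definition of a critical subnetwork in the abstract can be made concrete with a small sketch. The code below is a toy illustration only, not the extraction procedure used in the paper: it assumes a hypothetical two-layer network with three ReLU channels, zeroes each channel in turn, and ranks channels by how much the output score drops. All names (`forward`, `channel_importance`, `critical_channels`) and all weights are invented for illustration.

```python
# Toy illustration of "suppress a channel to zero and measure the damage".
# Hypothetical network and weights; NOT the paper's extraction method.

def forward(x, w1, w2, gates):
    """Return a scalar score; gates[j] == 0 suppresses hidden channel j."""
    hidden = []
    for j, row in enumerate(w1):
        # One ReLU "channel" per row of w1.
        act = max(0.0, sum(w * xi for w, xi in zip(row, x)))
        hidden.append(act * gates[j])
    return sum(w * h for w, h in zip(w2, hidden))


def channel_importance(x, w1, w2):
    """Score drop caused by zeroing each channel on its own."""
    n = len(w1)
    full = forward(x, w1, w2, [1] * n)
    drops = []
    for j in range(n):
        gates = [1] * n
        gates[j] = 0  # suppress channel j only
        drops.append(full - forward(x, w1, w2, gates))
    return drops


def critical_channels(x, w1, w2, k):
    """Indices of the k channels whose suppression hurts the score most."""
    drops = channel_importance(x, w1, w2)
    return sorted(range(len(drops)), key=lambda j: drops[j], reverse=True)[:k]


# Assumed toy input and weights.
x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
w2 = [0.5, 2.0, 1.0]

print(channel_importance(x, w1, w2))    # [0.5, 4.0, 0.0]
print(critical_channels(x, w1, w2, 2))  # [1, 0]
```

In this toy case the third channel is inactive for the given input, so suppressing it changes nothing, while suppressing the second channel removes most of the score. A sample-specific subnetwork in the sense of the abstract would be computed per input, which is why `x` is an argument here; ablating one channel at a time is a simplification chosen for brevity.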