Pruning Deep Convolutional Neural Network Using Conditional Mutual Information

Tien Vu-Van, Dat Du Thanh, Nguyen Ho, Mai Vu
2024-11-28
Abstract: Convolutional Neural Networks (CNNs) achieve high performance in image classification tasks but are challenging to deploy on resource-limited hardware due to their large model sizes. To address this issue, we leverage Mutual Information, a metric that provides valuable insights into how deep learning models retain and process information through measuring the shared information between input features or output labels and network layers. In this study, we propose a structured filter-pruning approach for CNNs that identifies and selectively retains the most informative features in each layer. Our approach successively evaluates each layer by ranking the importance of its feature maps based on Conditional Mutual Information (CMI) values, computed using a matrix-based Rényi α-order entropy numerical method. We propose several formulations of CMI to capture correlation among features across different layers. We then develop various strategies to determine the cutoff point for CMI values to prune unimportant features. This approach allows parallel pruning in both forward and backward directions and significantly reduces model size while preserving accuracy. Tested on the VGG16 architecture with the CIFAR-10 dataset, the proposed method reduces the number of filters by more than a third, with only a 0.32% drop in test accuracy.
Machine Learning
What problem does this paper attempt to address?
The main problem this paper addresses is how to significantly reduce the size and complexity of a Convolutional Neural Network (CNN) through structured pruning, with minimal loss of accuracy, so that the model can be deployed on resource-limited hardware. Specifically, the authors propose a structured filter-pruning method based on Conditional Mutual Information (CMI). The method evaluates the importance of the feature maps in each layer and retains only the most informative ones. The key points of the paper are:

1. **Problem Background**:
   - CNNs perform very well in image classification tasks, but their large model sizes make them difficult to deploy on resource-limited hardware.
   - Existing pruning methods mainly rely on weight magnitude or gradient information to identify unimportant connections, and these criteria ignore the correlation between features in different layers.
2. **Solution**:
   - The authors use CMI to measure feature importance. It builds on Mutual Information, which captures the information shared between network layers and the input features or output labels.
   - A layer-by-layer pruning method is proposed. It ranks the feature maps in each layer by their CMI values, which are computed with the matrix-based Rényi α-order entropy numerical method.
   - Several strategies are designed to determine the cutoff point for the CMI values, which decides which features are pruned.
3. **Experimental Verification**:
   - Experiments with the VGG16 architecture on the CIFAR-10 dataset show that the method reduces the number of filters by more than one third, with only a 0.32% drop in test accuracy.
4. **Innovation Points**:
   - Several CMI formulations are proposed to capture the correlation between features in different layers.
   - A bidirectional pruning algorithm is developed that prunes in the forward and backward directions in parallel, further improving pruning efficiency.

In conclusion, the paper develops a CNN pruning method based on Conditional Mutual Information that significantly reduces the size of deep convolutional neural networks while maintaining high accuracy, making them better suited to resource-limited environments.
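The CMI-based ranking summarized above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation. It uses the standard matrix-based Rényi α-order entropy, S_α(A) = log2(Σ_i λ_i(A)^α) / (1 − α), on a trace-normalized Gram matrix A, with joint entropy taken from the normalized Hadamard product, and it combines these through the identity I(X;Y|Z) = S(X,Z) + S(Y,Z) − S(X,Y,Z) − S(Z). The Gaussian kernel, the values `sigma=1.0` and `alpha=2.0`, the greedy selection order, and all function names are assumptions made for illustration. The paper's own CMI formulations and cutoff strategies may differ.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Trace-normalized Gaussian Gram matrix for n samples (assumed kernel)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)                      # flatten each sample
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)

def renyi_entropy(a, alpha=2.0):
    """Matrix-based Renyi alpha-order entropy in bits (alpha must not be 1)."""
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # drop tiny negative eigenvalues
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)

def joint_entropy(*mats, alpha=2.0):
    """Joint entropy from the trace-normalized Hadamard product of Gram matrices."""
    h = mats[0]
    for m in mats[1:]:
        h = h * m
    return renyi_entropy(h / np.trace(h), alpha)

def cmi(x, y, z, alpha=2.0):
    """I(X; Y | Z) = S(X,Z) + S(Y,Z) - S(X,Y,Z) - S(Z)."""
    return (joint_entropy(x, z, alpha=alpha) + joint_entropy(y, z, alpha=alpha)
            - joint_entropy(x, y, z, alpha=alpha) - renyi_entropy(z, alpha))

def rank_feature_maps(features, labels, sigma=1.0, alpha=2.0):
    """Greedy ordering (an assumed scheme): at each step, pick the feature map
    with the highest CMI with the labels given the maps already selected.

    features: array of shape (n_samples, n_channels, ...)
    labels:   array of shape (n_samples, n_classes), e.g. one-hot
    """
    n, c = features.shape[:2]
    grams = [gram_matrix(features[:, j], sigma) for j in range(c)]
    y = gram_matrix(labels, sigma)
    selected = np.full((n, n), 1.0 / n)            # constant variable: zero entropy
    remaining, order, scores = list(range(c)), [], []
    while remaining:
        vals = [cmi(grams[j], y, selected, alpha) for j in remaining]
        best = int(np.argmax(vals))
        j = remaining.pop(best)
        order.append(j)
        scores.append(float(vals[best]))
        joint = selected * grams[j]                # condition on the new map as well
        selected = joint / np.trace(joint)
    return order, scores
```

Calling `rank_feature_maps(activations, one_hot_labels)` on one layer's activations returns the channel order and the CMI score at each step. A cutoff rule applied to the `scores` list would then decide how many of the ranked filters to keep. Each CMI evaluation requires eigendecompositions of n × n matrices, so in practice the estimate would be computed on a modest batch of samples.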