FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary

Yingzhen Yang, Jiahui Yu, Nebojsa Jojic, Jun Huan, Thomas S. Huang
DOI: https://doi.org/10.48550/arXiv.1902.03264
2020-04-10
Abstract: We present a novel method of compression of deep Convolutional Neural Networks (CNNs) by weight sharing through a new representation of convolutional filters. The proposed method reduces the number of parameters of each convolutional layer by learning a 1D vector termed Filter Summary (FS). The convolutional filters are located in FS as overlapping 1D segments, and nearby filters in FS share weights in their overlapping regions in a natural way. The resultant neural network based on such a weight sharing scheme, termed Filter Summary CNNs or FSNet, has an FS in each convolution layer instead of a set of independent filters in the conventional convolution layer. FSNet has the same architecture as that of the baseline CNN to be compressed, and each convolution layer of FSNet has the same number of filters from FS as that of the baseline CNN in the forward process. With compelling computational acceleration ratio, the parameter space of FSNet is much smaller than that of the baseline CNN. In addition, FSNet is quantization friendly. FSNet with weight quantization leads to even higher compression ratio without noticeable performance loss. We further propose Differentiable FSNet (DFSNet), where the way filters share weights is learned in a differentiable and end-to-end manner. Experiments demonstrate the effectiveness of FSNet in compression of CNNs for computer vision tasks including image classification and object detection, and the effectiveness of DFSNet is evidenced by the task of Neural Architecture Search.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the excessive parameter count and high computational complexity of deep convolutional neural networks (CNNs). Because of their large number of parameters, conventional CNN models are difficult to deploy on mobile devices or other platforms with limited storage, and inference is computationally expensive. The large parameter space has also prompted researchers to explore regularization methods to prevent overfitting.

To address these problems, the authors propose a new method for compressing CNNs that achieves weight sharing through a Filter Summary (FS). The method reduces the number of parameters in each convolutional layer by learning a one-dimensional vector, the FS, that represents the layer's filters. The convolutional filters are located in the FS as overlapping one-dimensional segments, so adjacent filters naturally share the weights in their overlapping regions. The neural network built on this weight-sharing scheme is called Filter Summary CNNs, or FSNet.

### Specific Objectives:

1. **Reduce the Number of Parameters**: By introducing the Filter Summary (FS), the filters in a convolutional layer share weights, which significantly reduces the number of parameters in the model.
2. **Accelerate Computation**: The paper proposes a fast convolution algorithm, Fast Convolution by Filter Summary (FCFS), which exploits the overlapping structure of filters in the FS to avoid redundant computation and speed up convolution.
3. **Quantization-Friendly**: FSNet is amenable to weight quantization, which further improves the compression ratio without noticeable performance loss.
4. **End-to-End Differentiable**: The paper proposes Differentiable FSNet (DFSNet), in which the way filters share weights is learned in a differentiable, end-to-end manner in order to find a better weight-sharing pattern.

Together, these contributions allow FSNet to significantly reduce model size while maintaining high accuracy and to accelerate inference, which makes it especially suitable for scenarios requiring efficient deployment.
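
To make the weight-sharing idea concrete, the following is a minimal NumPy sketch of how flattened filters could be read out of a 1D Filter Summary as overlapping segments. It is an illustration under stated assumptions, not the authors' implementation: the function name `extract_filters`, the fixed `stride` between filter start positions, the wrap-around indexing at the end of the vector, and the layer dimensions are all hypothetical choices made for this example.

```python
import numpy as np


def extract_filters(fs, num_filters, filter_size, stride):
    """Slice overlapping 1D segments out of a Filter Summary vector.

    Assumption: filter i starts at offset i * stride and indices wrap
    around the end of fs. The paper's exact placement rule may differ.
    """
    starts = np.arange(num_filters) * stride
    offsets = np.arange(filter_size)
    # Row i holds the indices of fs that make up filter i.
    idx = (starts[:, None] + offsets[None, :]) % fs.size
    return fs[idx]  # shape: (num_filters, filter_size)


# Hypothetical layer: 8 filters of spatial size 3x3 over 4 input channels.
num_filters, in_channels, kernel = 8, 4, 3
filter_size = in_channels * kernel * kernel  # 36 weights per flattened filter
stride = 9  # neighbouring filters overlap in 36 - 9 = 27 weights

rng = np.random.default_rng(0)
fs = rng.standard_normal(num_filters * stride)  # 72 learnable parameters

filters = extract_filters(fs, num_filters, filter_size, stride)

# A conventional layer would store every filter independently.
baseline_params = num_filters * filter_size    # 288 weights
compression_ratio = baseline_params / fs.size  # 288 / 72

# Responses of all filters to one flattened input patch.
patch = rng.standard_normal(filter_size)
responses = filters @ patch  # shape: (num_filters,)
```

In this sketch the layer still exposes eight filters in the forward pass, but only the 72 entries of `fs` would be stored and trained, and the last 27 weights of each filter are the first 27 weights of the next one. The sketch materializes every filter explicitly, so it does not reproduce the savings that FCFS obtains from the overlap.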