Abstract: Very deep convolutional neural networks (CNNs) have been firmly established as the primary methods for many computer vision tasks. However, most state-of-the-art CNNs are large, which results in high inference latency. Recently, depth-wise separable convolution has been proposed for image recognition tasks on computationally limited platforms such as robotics and self-driving cars. Though it is much faster than its counterpart, regular convolution, accuracy is sacrificed. In this paper, we propose a novel decomposition approach based on SVD, namely depth-wise decomposition, for expanding regular convolutions into depth-wise separable convolutions while maintaining high accuracy. We show that our approach can be further generalized to the multi-channel and multi-layer cases, based on Generalized Singular Value Decomposition (GSVD) [59]. We conduct thorough experiments with the latest ShuffleNet V2 model [47] on both a randomly synthesized dataset and a large-scale image recognition dataset, ImageNet [10]. Our approach outperforms channel decomposition [73] on all datasets. More importantly, our approach improves the Top-1 accuracy of ShuffleNet V2 by ~2%.
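
As a rough illustration of why depth-wise separable convolution is faster than regular convolution, the sketch below counts multiply-accumulate operations for one layer. The layer sizes (256 input and output channels, a 3×3 kernel, a 14×14 output map) and the function names are assumptions chosen for the example, not values taken from the paper; stride 1 and "same" padding are assumed.

```python
def regular_conv_macs(c_in, c_out, k, h, w):
    """Multiply-accumulates for a regular k x k convolution on an h x w output map."""
    return h * w * c_in * c_out * k * k


def separable_conv_macs(c_in, c_out, k, h, w):
    """Multiply-accumulates for a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = h * w * c_in * k * k      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, used only for illustration.
regular = regular_conv_macs(256, 256, 3, 14, 14)
separable = separable_conv_macs(256, 256, 3, 14, 14)
ratio = regular / separable
print(f"regular / separable cost ratio: {ratio:.2f}")
```

For these sizes the ratio is about 8.7. In general it equals `c_out * k*k / (k*k + c_out)`, which approaches `k*k` (9 for a 3×3 kernel) as the number of output channels grows.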

What problem does this paper attempt to address?

The paper asks how to obtain the speed of depth-wise separable convolutions in convolutional neural networks (CNNs) while maintaining high accuracy. Specifically, the authors propose a decomposition method based on singular value decomposition (SVD), called Depth-wise Decomposition, which expands a regular convolution into a depth-wise separable convolution, reducing computation while minimizing the loss in performance.

### Background and Motivation

Very deep CNNs achieve remarkable results on many computer vision tasks, but these models are usually large, which results in high inference latency. On platforms with limited computing resources (such as robots and self-driving cars), depth-wise separable convolution is much faster than regular convolution, but it sacrifices accuracy. How to keep the speed of depth-wise separable convolution while maintaining high accuracy has therefore become an important research direction.

### Solution

The authors propose Depth-wise Decomposition, which uses SVD to decompose a regular convolution layer into a depth-wise separable convolution layer while minimizing performance loss. The steps are as follows:

1. **Decomposition of a single-channel convolution layer**
   - For a convolution layer with a single channel, decompose its weight tensor \(W\) into a depth-wise convolution weight \(D\) and a point-wise convolution weight \(P\).
   - Apply SVD to the output tensor \(Y\) to obtain the projection vector \(V_0\).
   - Construct the depth-wise convolution weight \(D = V_0W\) and the point-wise convolution weight \(P = V_0\).
2. **Decomposition of a multi-channel convolution layer**
   - Apply the single-channel algorithm to each channel separately to obtain the depth-wise convolution weights \(D_i\) and point-wise convolution weights \(P_i\).
   - To reduce the error of the multi-channel decomposition, introduce cross-channel error compensation: generalized singular value decomposition (GSVD) is used to account for the error from the previous channel.
3. **Decomposition of a multi-layer network**
   - Apply Depth-wise Decomposition layer by layer, taking into account the error that accumulates across layers.
   - Because the feature-map responses change as decomposition proceeds, extract all feature-map patches \(Y'\) again and decompose against them.

### Experimental Results

The authors report experiments on a randomly synthesized dataset and on ImageNet to verify the effectiveness of the proposed method:

- **Single-layer acceleration**: In the single-layer 9× acceleration experiment on random data, Depth-wise Decomposition is at least comparable to channel decomposition and performs better across the convolution layers tested.
- **Whole-model compression**: On the ShuffleNet V2 model, Depth-wise Decomposition improves Top-1 accuracy by about 2%.
- **Generalization ability**: Results on other ShuffleNet V2 architectures and on Xception models are consistent with these findings, indicating that the method generalizes well.

### Conclusion

With Depth-wise Decomposition, the authors obtain the speed of depth-wise separable convolution while maintaining high accuracy, providing an effective option for platforms with limited computing resources. The experimental results on multiple models and datasets support the method's effectiveness.
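
The single-channel step in the summary above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `depthwise_decompose_single_channel`, the matrix shapes (\(W\) as `n × k²` flattened kernels, input patches `X` as `k² × N`), and the random data are all hypothetical. \(V_0\) is treated as a column vector, so the product written as \(D = V_0W\) is computed as `v0.T @ W`.

```python
import numpy as np


def depthwise_decompose_single_channel(W, X):
    """Rank-1 split of a conv layer with a single input channel.

    W: (n, k*k) flattened kernels, one row per output channel.
    X: (k*k, N) input patches, one column per sampled location.
    Returns D (1, k*k), P (n, 1) and the singular values of Y = W @ X,
    such that P @ (D @ X) approximates Y.
    """
    Y = W @ X                                   # layer responses on the samples
    U, S, _ = np.linalg.svd(Y, full_matrices=False)
    v0 = U[:, :1]                               # leading left singular vector, shape (n, 1)
    D = v0.T @ W                                # depth-wise kernel (assumed transpose, see lead-in)
    P = v0                                      # point-wise (1 x 1) weights
    return D, P, S


# Random data standing in for real weights and feature-map patches.
rng = np.random.default_rng(0)
n, k, N = 16, 3, 200
W = rng.standard_normal((n, k * k))
X = rng.standard_normal((k * k, N))

D, P, S = depthwise_decompose_single_channel(W, X)
Y = W @ X
err = np.linalg.norm(Y - P @ (D @ X))           # Frobenius norm of the residual
print("reconstruction error:", err)
print("energy in discarded singular values:", np.sqrt(np.sum(S[1:] ** 2)))
```

The two printed numbers agree because projecting `Y` onto its leading left singular vector gives the best rank-1 approximation of `Y`, so the residual is exactly the energy in the remaining singular values. The multi-channel GSVD compensation and the multi-layer procedure are not covered by this sketch.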

Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks

Network Decoupling: From Regular to Depthwise Separable Convolutions

Binocular Depth Estimation Using Convolutional Neural Network With Siamese Branches.

Real-time Semantic Segmentation with Weighted Factorized-Depthwise Convolution

A Depthwise Separable Convolution Hardware Accelerator for ShuffleNetV2

Optimizing Depthwise Separable Convolution Operations on GPUs

XSepConv: Extremely Separated Convolution

Sparsing Deep Neural Network Using Semi-Discrete Matrix Decomposition

XSepConv: Extremely Separated Convolution for Efficient Deep Networks with Large Kernels

A Cnn-Based Depth Estimation Approach With Multi-Scale Sub-Pixel Convolutions And A Smoothness Constraint

A High-speed Low-cost CNN Inference Accelerator for Depthwise Separable Convolution

Speeding Up Deep Convolutional Neural Networks Based on Tucker-CP Decomposition

Sparse Kronecker Canonical Polyadic Decomposition for Convolutional Neural Networks Compression

Depthwise Multiception Convolution for Reducing Network Parameters without Sacrificing Accuracy

InDeed: Interpretable image deep decomposition with guaranteed generalizability

Learning Depthwise Separable Graph Convolution from Data Manifold

3D Depthwise Convolution: Reducing Model Parameters in 3D Vision Tasks

Holistic Decomposition Convolution for Effective Semantic Segmentation of Medical Volume Images.

Collaborative Deconvolutional Neural Networks for Joint Depth Estimation and Semantic Segmentation

MobileXNet: An Efficient Convolutional Neural Network for Monocular Depth Estimation

Accelerating Depthwise Separable Convolutions on Ultra-Low-Power Devices