Supplementary Material: Dynamic Group Convolution for Accelerating Convolutional Neural Networks

Zhuo Su, Linpu Fang, Wenxiong Kang, Dewen Hu, Matti Pietikäinen, Li Liu
2020-01-01
Abstract: 1.1 Detailed derivations and updating strategy. Let x ∈ ℝ^{C×H×W} and x′ ∈ ℝ^{C′×H′×W′} be the input and output of a particular DGC layer, and let ξ denote the pruning rate. In a dynamic group convolution (DGC) network, the head-wise threshold ensures that, after training, each head selects exactly the number of channels dictated by the target pruning rate ξ, i.e., (1 − ξ)C channels are selected from the input volume x (see Eq. 6 in the paper). The global threshold T, by contrast, makes DGC structures more flexible: it allows an uneven channel selection among heads within any DGC layer while keeping the average pruning rate of the whole structure at the target ξ with only a tiny deviation. To obtain T, all saliency vectors throughout the network are first collected and concatenated into a single saliency vector G.
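The difference between the two thresholds can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names `headwise_select`, `global_threshold`, and `global_select` are hypothetical. The rule used to pick T (the value in the sorted vector G below which roughly a fraction ξ of the entries fall) is an assumption, since the excerpt stops before stating how T is derived from G.

```python
import math
from typing import List, Sequence


def headwise_select(saliency: Sequence[float], xi: float) -> List[int]:
    """Keep exactly (1 - xi) * C channels of one head, chosen by highest saliency."""
    c = len(saliency)
    k = max(1, round((1.0 - xi) * c))  # number of channels kept by this head
    ranked = sorted(range(c), key=lambda i: saliency[i], reverse=True)
    return sorted(ranked[:k])


def global_threshold(saliency_vectors: Sequence[Sequence[float]], xi: float) -> float:
    """Concatenate all saliency vectors into G and pick a threshold T.

    Assumption: T is the entry of sorted G such that about a fraction xi
    of all entries lie strictly below it.
    """
    g = sorted(v for vec in saliency_vectors for v in vec)  # the vector G, sorted
    n_pruned = min(math.floor(xi * len(g)), len(g) - 1)
    return g[n_pruned]


def global_select(saliency: Sequence[float], t: float) -> List[int]:
    """Keep every channel whose saliency reaches the global threshold T."""
    return [i for i, s in enumerate(saliency) if s >= t]


# Two heads with four input channels each, target pruning rate xi = 0.5.
heads = [[0.9, 0.8, 0.1, 0.7], [0.2, 0.3, 0.05, 0.6]]
xi = 0.5

per_head = [headwise_select(h, xi) for h in heads]     # two channels per head
t = global_threshold(heads, xi)
per_global = [global_select(h, t) for h in heads]      # uneven split across heads
print(per_head, t, per_global)
```

In this toy example the head-wise rule keeps two channels in each head, while the global rule keeps three channels in the first head and one in the second. The overall pruning rate is still 4 of 8 channels, i.e. ξ = 0.5. With tied saliency values the global rule can keep slightly more channels than the target, which is one way the "tiny deviation" mentioned above could arise.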