Design of A Backbone Without Pretraining

Shaoqi Hou, Wenyi Du, Yiyin Ding, Yuhao Zeng, Chunyu Wang, Guangqiang Yin
DOI: https://doi.org/10.1109/icicip53388.2021.9642216
2021-01-01
Abstract: The excellent performance of deep learning depends on the strong representation ability of its backbone. Pretraining, the conventional practice for most backbones, lets the model reach high accuracy, but it also brings disadvantages that cannot be ignored: first, the structures of backbones that need pretraining are fixed, so they are difficult to modify and to migrate across tasks; second, the pretraining process consumes huge computing power. To solve these problems, we propose a backbone named RVNet (Residual VGGNet), which lets the model converge quickly without pretraining. The design of RVNet is divided into two steps. First, the residual convolutional layer (RCL) is designed by drawing on the residual technique and the batch normalization (BN) layer, which prevents the gradient from vanishing and constrains the data distribution; at the same time, the introduced 1×1 convolution layer improves the nonlinearity of the model while controlling the number of feature-map channels. Then, based on VGGNet-19, the designed RCLs replace the original 3×3 convolution layers to improve the representation ability of the backbone. We take the person re-identification (Re-ID) task as the research object and demonstrate the effectiveness and superiority of RVNet through a series of ablation experiments.
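The second design step, swapping each 3×3 convolution of VGGNet-19 for an RCL, can be sketched as a layer plan in plain Python. The abstract does not give the internal layout of the RCL, so the layout used here (a 3×3 convolution with BN on the main path and a 1×1 convolution with BN on the shortcut) is an assumption made only for illustration, and the names `rcl_params` and `build_plan` are hypothetical. The channel configuration is the standard VGG-19 one.

```python
# Standard VGG-19 convolutional configuration:
# integers are output channels of a 3x3 convolution, "M" is 2x2 max pooling.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def rcl_params(c_in, c_out):
    """Parameter count of a hypothetical RCL (assumed layout, not from the paper):
    main path = 3x3 conv + BN, shortcut = 1x1 conv + BN to match channels."""
    bn = 2 * c_out  # learnable scale and shift per channel
    main = conv_params(c_in, c_out, 3) + bn
    shortcut = conv_params(c_in, c_out, 1) + bn
    return main + shortcut


def build_plan(cfg, in_channels=3):
    """Replace every 3x3 convolution in cfg with an RCL entry.

    Returns a list of (kind, c_in, c_out, n_params) tuples."""
    plan = []
    c_in = in_channels
    for item in cfg:
        if item == "M":
            plan.append(("maxpool", c_in, c_in, 0))
        else:
            plan.append(("rcl", c_in, item, rcl_params(c_in, item)))
            c_in = item
    return plan


plan = build_plan(VGG19_CFG)
num_rcl = sum(1 for layer in plan if layer[0] == "rcl")
total_params = sum(layer[3] for layer in plan)
```

Under this assumed layout, the 1×1 shortcut is what allows a residual sum when the input and output channel counts differ, for example at the first layer (3 to 64 channels); it adds far fewer parameters than the 3×3 convolution it runs alongside.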