Rethinking the Value of Local Feature Fusion in Convolutional Neural Networks

Zhenyu Lou, Xin Ye, Luoming Zhang, Weijia Wu, Yefei He, Hong Zhou
DOI: https://doi.org/10.1007/s11063-023-11193-5
IF: 2.565
2023-01-01
Neural Processing Letters
Abstract: The traditional CNN head for classification tasks typically consists of a global average pooling layer before the last fully connected classifier. However, such a simple, lightweight head lacks the ability to fuse features and cannot fully exploit the strong feature extraction ability of the network body. In the present work, we analyze the Basic Block and Bottleneck structures in ResNet in depth and reveal the importance of performing feature fusion inside local patches via 1×1 convolution. We propose a new head structure consisting of three stages with a series of 1×1 convolutions to replace global average pooling. With few additional FLOPs and little drop in inference speed, our new head improves the accuracy for ResNet18 by 3.6
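
To make the two building blocks named in the abstract concrete, here is a minimal NumPy sketch of the baseline head (global average pooling followed by a fully connected classifier) and of a 1×1 convolution, which mixes channels independently at each spatial position. This is not the paper's three-stage head, whose layout the abstract does not specify. The function names, the 512×7×7 feature-map shape, the 1000-class output, and the single "1×1 convolution + ReLU before pooling" step are all assumptions chosen for illustration.

```python
import numpy as np


def gap_fc_head(features, weight, bias):
    """Baseline head: global average pooling, then one fully connected layer.

    features: (C, H, W) feature map from the network body
    weight:   (num_classes, C); bias: (num_classes,)
    """
    pooled = features.mean(axis=(1, 2))   # (C,): one average per channel
    return weight @ pooled + bias         # (num_classes,) logits


def conv1x1(features, kernel):
    """1x1 convolution: a channel-mixing matrix applied at every position.

    features: (C_in, H, W); kernel: (C_out, C_in) -> output (C_out, H, W)
    """
    return np.einsum("oc,chw->ohw", kernel, features)


rng = np.random.default_rng(0)
features = rng.standard_normal((512, 7, 7))      # assumed feature-map shape
fc_w = rng.standard_normal((1000, 512)) * 0.01   # assumed 1000-class classifier
fc_b = np.zeros(1000)

# Baseline: pool first, then classify.
logits = gap_fc_head(features, fc_w, fc_b)

# Hypothetical fusion step (an assumption, not the paper's design):
# mix channels with a 1x1 convolution and a ReLU before pooling.
fuse_k = rng.standard_normal((512, 512)) * 0.01
fused = np.maximum(conv1x1(features, fuse_k), 0.0)
fused_logits = gap_fc_head(fused, fc_w, fc_b)
```

Without the nonlinearity, a 1×1 convolution followed by average pooling would collapse into a single linear map on the pooled vector, which is why the sketch places a ReLU between the two.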