Delving into Fully Convolutional Networks Activations for Visual Recognition.

Longfei Zhang, Yanming Guo
DOI: https://doi.org/10.1145/3195588.3195604
2018-01-01
Abstract: Convolutional Neural Networks (CNNs) have attracted significant attention in visual recognition. To fully exploit the potential of CNN models for image classification, this paper introduces several new ideas, ranging from feature generation to classifier selection. We start by transforming existing CNN models into fully convolutional networks (FCNs). This removes the restriction on input image resolution and makes data augmentation more efficient. Next, we propose a cross pooling strategy to aggregate the top-layer activations, which combines the advantages of average pooling and max pooling without enlarging the feature dimension. Moreover, we use a regularized logistic regression classifier for image classification and demonstrate that it works better with our feature than the commonly used linear SVM, in terms of both accuracy and efficiency. Through extensive experiments on four frequently benchmarked datasets, we find that the proposed scheme usually outperforms other leading approaches that use features of similar size, while delivering results comparable to state-of-the-art methods that depend on features of much higher dimension.
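As a rough illustration of the first two steps, the sketch below shows how a fully connected layer can be applied as a convolution, so that an input larger than the training resolution yields a spatial map of outputs, and how that map can then be aggregated by global average or max pooling. It is a minimal pure-Python sketch under stated assumptions, not the paper's implementation: the function names, the (c, i, j) weight layout, and the toy values are all hypothetical, and the cross pooling strategy itself is not defined in the abstract, so it is not implemented here.

```python
def fc_as_conv(fmap, weights, bias, k):
    """Apply a fully connected layer as a k x k convolution (stride 1, no padding).

    fmap:    C x H x W nested lists (activations feeding the original FC layer)
    weights: one row per output unit, each of length C*k*k,
             flattened in (c, i, j) order (an assumed layout)
    bias:    one value per output unit
    Returns an out x (H-k+1) x (W-k+1) nested list.
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = []
    for w_row, b in zip(weights, bias):
        plane = []
        for y in range(H - k + 1):
            row = []
            for x in range(W - k + 1):
                # Flatten the window in the same (c, i, j) order as the weights.
                patch = [fmap[c][y + i][x + j]
                         for c in range(C) for i in range(k) for j in range(k)]
                row.append(sum(w * p for w, p in zip(w_row, patch)) + b)
            plane.append(row)
        out.append(plane)
    return out


def global_avg_pool(maps):
    """One value per channel: the mean over all spatial positions."""
    return [sum(sum(r) for r in m) / (len(m) * len(m[0])) for m in maps]


def global_max_pool(maps):
    """One value per channel: the maximum over all spatial positions."""
    return [max(max(r) for r in m) for m in maps]


# Toy example: a layer "trained" on 2 x 2 inputs, applied to a 3 x 3 input.
fmap = [[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]]                      # 1 channel, 3 x 3
weights = [[1, 0, 0, 1], [0, 1, 1, 0]]    # 2 output units, k = 2
bias = [0, 1]

scores = fc_as_conv(fmap, weights, bias, k=2)   # 2 x 2 x 2 map of outputs
avg_feat = global_avg_pool(scores)
max_feat = global_max_pool(scores)
concat_feat = avg_feat + max_feat               # naive combination
```

When the input is exactly k x k, the output map is 1 x 1 and matches the original fully connected layer. Both pooled features have one entry per output unit, whereas simply concatenating them doubles the dimension, which is the growth the abstract says cross pooling avoids.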