Learning Feature Embedding with Strong Neural Activations for Fine-Grained Retrieval
Chen Shen, Chang Zhou, Zhongming Jin, Wenqing Chu, Rongxin Jiang, Yaowu Chen, Xian-Sheng Hua
DOI: https://doi.org/10.1145/3126686.3126708
2017-01-01
Abstract: Fine-grained object retrieval, which aims to find objects belonging to the same sub-category as the probe object in a large database, is becoming increasingly popular because of its significance for both research and applications. Recently, deep learning models based on convolutional neural networks (CNNs) have achieved promising retrieval performance, as they can jointly learn feature representations and discriminative distance metrics. Specifically, a common approach is to extract the activations of the fully-connected layer as feature descriptors and to optimize classification constraints (e.g., softmax loss) and similarity constraints (e.g., triplet loss) simultaneously, which improves the representational capability of the features. However, typical fully-connected layer activations focus on the global attributes of an image and are therefore relatively insensitive to specific local characteristics. As a result, the features learned by these approaches are generally not discriminative enough for retrieving fine-grained objects. To address this issue, we propose an effective feature embedding that simultaneously encodes the original global features and discriminative local features, where the local features are extracted by exploiting strong neural activations on the last convolutional layer. We show that the novel feature embedding can dramatically enlarge the gap between inter-class variance and intra-class variance, which is the key factor in improving retrieval precision. In addition, we show that our architecture can also be applied to person re-identification. Experimental results on multiple challenging benchmarks demonstrate that our method outperforms the current state-of-the-art approaches by large margins.
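The idea of combining a global descriptor with a local descriptor built from strong convolutional activations can be illustrated with a minimal NumPy sketch. The abstract does not specify how strong activations are selected or how the two parts are fused, so everything below is an assumption made for illustration: the function names (`strong_activation_descriptor`, `embed`, `triplet_loss`), the `top_k` parameter, the per-channel top-k averaging, the L2 normalization, the concatenation, and the margin value are all hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps avoids division by zero)."""
    return v / (np.linalg.norm(v) + eps)


def strong_activation_descriptor(conv_map, top_k=4):
    """Assumed local descriptor: for each channel of a (C, H, W) feature map,
    average the top_k largest spatial responses, yielding a C-dim vector."""
    c, h, w = conv_map.shape
    flat = conv_map.reshape(c, h * w)
    k = min(top_k, h * w)
    # After partitioning, the last k entries of each row are the k largest.
    strongest = np.partition(flat, -k, axis=1)[:, -k:]
    return strongest.mean(axis=1)


def embed(global_feat, conv_map, top_k=4):
    """Assumed fusion: concatenate the normalized global (fully-connected)
    feature with the normalized local strong-activation descriptor."""
    g = l2_normalize(np.asarray(global_feat, dtype=np.float64))
    local = strong_activation_descriptor(
        np.asarray(conv_map, dtype=np.float64), top_k
    )
    return np.concatenate([g, l2_normalize(local)])


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge-style triplet loss on squared Euclidean distances."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

In such a setup, `embed` would be applied to the fully-connected activations and the last convolutional feature map of each image, and `triplet_loss` would be evaluated on the resulting embeddings alongside a classification loss during training.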