Quartet-net Learning for Visual Instance Retrieval

Jiewei Cao,Zi Huang,Peng Wang,Chao Li,Xiaoshuai Sun,Heng Tao Shen
DOI: https://doi.org/10.1145/2964284.2967262
2016-01-01
Abstract:Recently, neuron activations extracted from a pre-trained convolutional neural network (CNN) show promising performance in various visual tasks. However, due to the domain and task bias, using the features generated from the model pre-trained for image classification as image representations for instance retrieval is problematic. In this paper, we propose quartet-net learning to improve the discriminative power of CNN features for instance retrieval. The general idea is to map the features into a space where the image similarity can be better evaluated. Our network differs from the traditional Siamese-net in two ways. First, we adopt a double-margin contrastive loss with a dynamic margin tuning strategy to train the network which leads to more robust performance. Second, we introduce in the mimic learning regularization to improve the generalization ability of the network by preventing it from overfitting to the training data. Catering for the network learning, we collect a large-scale dataset, namely GeoPair, which consists of 68k matching image pairs and 63k non-matching pairs. Experiments on several standard instance retrieval datasets demonstrate the effectiveness of our method.
What problem does this paper attempt to address?