CNN Vs. SIFT for Image Retrieval: Alternative or Complementary?

Ke Yan, Yaowei Wang, Dawei Liang, Tiejun Huang, Yonghong Tian
DOI: https://doi.org/10.1145/2964284.2967252
2016-01-01
Abstract: Over the past decade, SIFT has been widely used in vision tasks such as image retrieval. In recent years, however, deep convolutional neural network (CNN) features have achieved state-of-the-art performance in several tasks, such as image classification and object detection. A natural question therefore arises: for the image retrieval task, can CNN features substitute for SIFT? In this paper, we experimentally demonstrate that the two kinds of features are highly complementary. Building on this finding, we propose an image representation model, complementary CNN and SIFT (CCS), that fuses CNN and SIFT in a multi-level and complementary way. In particular, it can simultaneously describe scene-level, object-level and point-level content in images. Extensive experiments on four image retrieval benchmarks show that CCS achieves state-of-the-art retrieval results.