Graph-based Reasoning Attention Pooling with Curriculum Design for Content-Based Image Retrieval

Xiaoguang Zhu, Haoyu Wang, Peilin Liu, Zhantao Yang, Jiuchao Qian
DOI: https://doi.org/10.1016/j.imavis.2021.104289
IF: 3.86
2021-01-01
Image and Vision Computing
Abstract: Global single-pass methods have shown superior efficiency over local aggregation methods in content-based image retrieval. However, they tend to fail in challenging environments because they do not exploit the structural relations among regions. To address this issue, we propose a novel Graph-based Reasoning Attention Pooling with Curriculum Design (GRAP-CD), which improves the network's capability through a modified training procedure and trainable pooling. GRAP-CD not only explores relations among salient regions but also trains the network gradually so that it reaches better local minima. The graph-based reasoning layers treat the feature map from the last convolutional layer as a graph and construct structural relations among its regions. The graph-based attention layer then enhances the key information, guided by these relations. In addition, a front-end curriculum design splits the training dataset from simple to complex and trains the model step by step, so that GRAP first learns basic feature information from simple samples and then learns to extract more representative features from hard positive samples. On the popular ROxford and RParis benchmarks, GRAP-CD improves on state-of-the-art global single-pass methods and achieves results competitive with local aggregation methods.
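The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the abstract does not give the layer definitions, so the dot-product affinity, the single propagation step with a residual connection, the softmax attention, and the names `graph_attention_pool`, `curriculum_stages`, `w_reason`, and `w_attn` are all assumptions made for illustration. Whether the curriculum stages are disjoint or cumulative is likewise not stated; disjoint stages are assumed here.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def graph_attention_pool(feature_map, w_reason, w_attn):
    """Pool an (H, W, C) feature map into one L2-normalised C-dim descriptor.

    w_reason: (C, C) hypothetical weights of one graph-reasoning step.
    w_attn:   (C,)   hypothetical weights that score each node.
    """
    h, w, c = feature_map.shape
    # Treat every spatial location as a graph node with a C-dim feature.
    nodes = feature_map.reshape(h * w, c)
    # Assumed affinity: scaled dot products, row-normalised into an adjacency.
    adjacency = softmax(nodes @ nodes.T / np.sqrt(c), axis=1)
    # One propagation step over the graph (ReLU), plus a residual connection.
    reasoned = np.maximum(adjacency @ nodes @ w_reason, 0.0) + nodes
    # Attention weights over nodes, computed from relation-aware features.
    attn = softmax(reasoned @ w_attn, axis=0)
    # Attention-weighted sum of node features, then L2 normalisation.
    pooled = attn @ reasoned
    return pooled / (np.linalg.norm(pooled) + 1e-12)


def curriculum_stages(difficulty, n_stages):
    """Split sample indices into disjoint stages, easiest samples first.

    difficulty: 1-D array of hypothetical per-sample difficulty scores.
    """
    order = np.argsort(difficulty, kind="stable")
    return np.array_split(order, n_stages)
```

In a training loop built on this sketch, the model would be trained on each stage returned by `curriculum_stages` in turn, with `graph_attention_pool` replacing a fixed global pooling layer; how the difficulty scores are obtained is left open here because the abstract does not specify it.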