Abstract: Visual retrieval tasks such as image retrieval and person re-identification (Re-ID) aim to search effectively and thoroughly for images with similar content or the same identity. After an initial set of examples is retrieved, re-ranking is a widely adopted post-processing step that reorders and improves the initial results by using contextual information from semantically neighboring samples. Prevailing re-ranking approaches update distance metrics and mostly rely on inefficient cross-check set comparison operations when computing distances based on expanded neighbors. In this work, we present an efficient re-ranking method that refines initial retrieval results by updating features. Specifically, we reformulate re-ranking based on Graph Convolution Networks (GCN) and propose a novel Graph Convolution based Re-ranking (GCR) for visual retrieval tasks via feature propagation. To accelerate computation for large-scale retrieval, we introduce a decentralized and synchronous feature propagation algorithm that supports parallel or distributed computing. In particular, the plain GCR is extended to cross-camera retrieval, and an improved feature propagation formulation is presented to leverage affinity relationships across different cameras. It is also extended to video-based retrieval: Graph Convolution based Re-ranking for Video (GCRV) is proposed by mathematically deriving a novel profile vector generation method for the tracklet. Without bells and whistles, the proposed approaches achieve state-of-the-art performance on seven benchmark datasets from three different tasks, i.e., image retrieval, person Re-ID, and video-based person Re-ID.
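
The abstract does not state the propagation formula, so the following is only a generic sketch of the idea of re-ranking by updating features rather than distances: each feature is mixed with a weighted average of its k nearest neighbors' features, and the gallery is then re-ranked by cosine similarity on the refined features. It is not the authors' GCR formulation. The function names and the parameters `k`, `alpha`, and `steps` are illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def knn_affinity(features, k):
    """Symmetric k-NN affinity matrix from cosine similarity.

    Assumes `features` has unit-length rows, so each sample is
    (up to ties) its own nearest neighbor.
    """
    sim = features @ features.T
    n = sim.shape[0]
    # Indices of the k most similar samples per row.
    idx = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    adj = np.zeros_like(sim)
    adj[rows, idx.ravel()] = sim[rows, idx.ravel()]
    # Keep an edge if either endpoint selected the other.
    adj = np.maximum(adj, adj.T)
    # Discard negative similarities so weights stay non-negative.
    return np.clip(adj, 0.0, None)


def propagate_features(features, k=3, alpha=0.5, steps=2):
    """Hypothetical feature propagation: blend each feature with
    the affinity-weighted mean of its graph neighbors."""
    x = l2_normalize(np.asarray(features, dtype=float))
    for _ in range(steps):
        adj = knn_affinity(x, k)
        adj = adj / (adj.sum(axis=1, keepdims=True) + 1e-12)  # row-normalize
        x = l2_normalize((1.0 - alpha) * x + alpha * (adj @ x))
    return x


def rank(query_idx, features):
    """Indices of all other samples, most similar to the query first."""
    sims = features @ features[query_idx]
    order = np.argsort(-sims)
    return order[order != query_idx]
```

Because every row update depends only on that row's neighbors from the previous step, the updates within one step could in principle be computed independently, which is the property a parallel or distributed variant would exploit.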

Pyramid regional graph representation learning for content-based video retrieval

Unsupervised Teacher-Student Model for Large-Scale Video Retrieval

VRAG: Region Attention Graphs for Content-Based Video Retrieval

Spatial-temporal Graphs for Cross-modal Text2Video Retrieval

Explicit and implicit concept-based video retrieval with bipartite graph propagation model

Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval

Large-Scale Video Retrieval Via Deep Local Convolutional Features

Deep Representation-Based Fuzzy Graph Model for Content-Based Image Retrieval

Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning

Proposal-Free Video Grounding with Contextual Pyramid Network

Graph Convolution Based Efficient Re-Ranking for Visual Retrieval

Graph-based Reasoning Attention Pooling with Curriculum Design for Content-Based Image Retrieval

An Effective Video Retrieval Approach Based on Multi-modality Concept Correlation Graph

Multi-Dimensional Attentive Hierarchical Graph Pooling Network for Video-Text Retrieval

Video Data Hierarchical Retrieval Via Deep Hash Method

Efficient Retrieval from Large-Scale Egocentric Visual Data Using a Sparse Graph Representation

Graph-based Multi-Space Semantic Correlation Propagation for Video Retrieval

Video Relation Detection with Spatio-Temporal Graph

Relational Reasoning over Spatial-Temporal Graphs for Video Summarization

Neighborhood Preserving Hashing for Scalable Video Retrieval

Video Captioning with Aggregated Features Based on Dual Graphs and Gated Fusion