Fine-Grained Instance-Level Sketch-Based Video Retrieval

Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song

DOI: https://doi.org/10.1109/tcsvt.2020.3014491

IF: 5.859

2021-05-01

IEEE Transactions on Circuits and Systems for Video Technology

Abstract: Existing sketch-analysis work studies sketches depicting static objects or scenes. In this work, we propose a novel cross-modal retrieval problem of fine-grained instance-level sketch-based video retrieval (FG-SBVR), where a sketch sequence is used as a query to retrieve a specific target video instance. Compared with sketch-based still image retrieval and coarse-grained category-level video retrieval, this is more challenging, as both visual appearance and motion need to be matched simultaneously at a fine-grained level. We contribute the first FG-SBVR dataset with rich annotations. We then introduce a novel multi-stream multi-modality deep network to perform FG-SBVR under both strongly and weakly supervised settings. The key component of the network is a relation module, designed to prevent model overfitting given scarce training data. We show that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.

engineering, electrical & electronic

What problem does this paper attempt to address?

The paper addresses fine-grained instance-level sketch-based video retrieval (FG-SBVR): using a sketch sequence as a query to retrieve one specific target video instance. This is more challenging than sketch-based still image retrieval or coarse-grained category-level video retrieval because both visual appearance and motion must be matched at a fine-grained level. The main contributions of the paper are:

1. Proposing the novel problem of fine-grained instance-level sketch-based video retrieval.
2. Contributing the first FG-SBVR dataset with rich annotations, which includes 1,448 sketches corresponding to 528 figure skating video clips.
3. Introducing a novel multi-stream multi-modality deep network for FG-SBVR, studied under both strongly and weakly supervised settings.
4. Designing a relation module that prevents model overfitting, so the model can be trained effectively even when training data is scarce.

The paper reports that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.
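To make the retrieval setting concrete, here is a minimal sketch of instance-level retrieval and its evaluation, assuming the sketch query and every gallery video have already been mapped into a shared embedding space. It is not the paper's network: the 2-D embeddings are hand-made toy values, Euclidean distance and the top-K accuracy metric are assumptions chosen for illustration, and the names `rank_videos` and `acc_at_k` are hypothetical.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_videos(query: Sequence[float], gallery: List[Sequence[float]]) -> List[int]:
    """Return gallery indices ordered from nearest to farthest from the query."""
    return sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))


def acc_at_k(queries: List[Sequence[float]],
             gallery: List[Sequence[float]],
             targets: List[int],
             k: int) -> float:
    """Fraction of queries whose true video instance appears in the top-k results."""
    hits = sum(
        1 for q, t in zip(queries, targets)
        if t in rank_videos(q, gallery)[:k]
    )
    return hits / len(queries)


# Toy embeddings (hypothetical). In a real system these would come from
# trained sketch and video encoders, which are not modelled here.
video_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
sketch_embeddings = [[0.1, 0.1], [0.9, 0.2], [0.7, 0.6]]
target_video = [0, 1, 2]  # index of the one correct video for each sketch

top1 = acc_at_k(sketch_embeddings, video_embeddings, target_video, k=1)
top2 = acc_at_k(sketch_embeddings, video_embeddings, target_video, k=2)
print(f"acc@1 = {top1:.3f}, acc@2 = {top2:.3f}")
```

In this toy gallery the third sketch lies closer to the wrong video than to its target, so it counts as a miss at K=1 and a hit at K=2. Instance-level retrieval is stricter than category-level retrieval in exactly this way: each query has a single correct answer, so a near miss within the same category does not count.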

Fine-Grained Video Retrieval With Scene Sketches

SceneSketcher-v2: Fine-Grained Scene-Level Sketch-Based Image Retrieval Using Adaptive GCNs

SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches

Sketch-Based 3D Model Retrieval via Multi-feature Fusion

Instance-Level Coupled Subspace Learning for Fine-Grained Sketch-Based Image Retrieval.

Deep Multimodal Embedding Model for Fine-grained Sketch-based Image Retrieval

Sketch-Based Retrieval in Large-Scale Image Database Via Position-Aware Silhouette Matching.

Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval

Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

Boosting Fine-Grained Sketch-Based Image Retrieval with Self-Supervised Learning

Deep Reinforced Attention Regression for Partial Sketch Based Image Retrieval

Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

Deep Cascaded Cross-Modal Correlation Learning for Fine-Grained Sketch-Based Image Retrieval.

Sketch-based Video Object Segmentation: Benchmark and Analysis

Single Color Sketch-Based Image Retrieval in HSV Color Space

A Novel Visual-Region-Descriptor-based Approach to Sketch-based Image Retrieval

Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

Sketch-Based Image Retrieval with a Novel BoVW Representation.

Stacked Semantic-Guided Network for Zero-Shot Sketch-Based Image Retrieval.

Multi-Granularity Representation Learning for Sketch-based Dynamic Face Image Retrieval