Abstract: Effectively and efficiently retrieving images from remote-sensing databases is a critical challenge in the realm of remote-sensing big data. Using hand-drawn sketches as retrieval inputs is intuitive and user-friendly, yet the potential of multi-level feature integration from sketches remains underexplored, leading to suboptimal retrieval performance. To address this gap, our study introduces a novel zero-shot, sketch-based retrieval method for remote-sensing images, leveraging multi-level feature extraction, self-attention-guided tokenization and filtering, and cross-modality attention update. This approach uses only visual information and does not require semantic knowledge about the sketch or the image. It first applies multi-level, self-attention-guided feature extraction to tokenize the query sketches, and self-attention feature extraction to tokenize the candidate images. It then uses cross-attention mechanisms to establish token correspondence between the two modalities, from which the sketch-to-image similarity is computed. Our method significantly outperforms existing sketch-based remote-sensing image retrieval techniques, as evidenced by tests on multiple datasets. Notably, it also exhibits robust zero-shot learning capabilities in handling unseen categories and strong domain adaptation capabilities in handling previously unseen remote-sensing data. The method's scalability can be further enhanced by pre-calculating the retrieval tokens for all candidate images in a database. This research underscores the significant potential of multi-level, attention-guided tokenization in cross-modal remote-sensing image retrieval. For broader accessibility and research facilitation, we have made the code and dataset used in this study publicly available online.
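As a rough illustration of the retrieval step described in the abstract (cross-attention between sketch tokens and pre-calculated image tokens, followed by a similarity score), the following minimal NumPy sketch may help. It is not the authors' implementation: the function names, the absence of learned projection matrices, and the mean-cosine scoring rule are all assumptions made for illustration, and the tokens are assumed to have been extracted already.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def cross_attention_similarity(sketch_tokens, image_tokens):
    """Score one sketch against one image (hypothetical scoring rule).

    sketch_tokens: array of shape (Ns, d)
    image_tokens:  array of shape (Ni, d)
    """
    d = sketch_tokens.shape[1]
    # Each sketch token attends over all image tokens (scaled dot product).
    attn = softmax(sketch_tokens @ image_tokens.T / np.sqrt(d), axis=1)
    # Attention-weighted image summary for every sketch token: (Ns, d).
    matched = attn @ image_tokens
    # Cosine similarity between each sketch token and its matched summary.
    cos = np.sum(l2_normalize(sketch_tokens) * l2_normalize(matched), axis=1)
    return float(cos.mean())


def rank_images(sketch_tokens, image_token_db):
    """Rank a database of pre-calculated image tokens for one query sketch.

    image_token_db: list of (Ni, d) arrays, computed once and stored.
    Returns (indices sorted best-first, raw scores).
    """
    scores = np.array(
        [cross_attention_similarity(sketch_tokens, t) for t in image_token_db]
    )
    return np.argsort(-scores), scores


# Toy usage: random "tokens" stand in for real extracted features.
rng = np.random.default_rng(0)
db = [rng.normal(size=(8, 64)) for _ in range(3)]
# The query resembles four tokens of image 1, plus a little noise.
query = db[1][:4] + 0.01 * rng.normal(size=(4, 64))
order, scores = rank_images(query, db)
```

Because the image tokens in `image_token_db` do not depend on the query, they can be computed once and cached, which is the scalability point the abstract makes; only the lightweight attention and scoring step runs per query.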

Semantic Illustration Retrieval for Very Large Data Set

Semantic Image Retrieval Based on Multiple-Instance Learning

Research on high-level semantic image retrieval

Modeling Image Data for Effective Indexing and Retrieval in Large General Image Databases

SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches

A Semantic-Based Method for Visualizing Large Image Collections

Real-Time Image Semantic Retrieval Based on VQ

ImageSaker: A Semantic-based Image Retrieval System Refining with Concept Model

Robust Semantic Sketch Based Specific Image Retrieval

An Automatic Illustration Recommendation System

Image Retrieval Model Providing Semantics and Visual-Features-based Query for Users

Allocating images and selecting image collections for distributed visual search

A new image retrieval system supporting query by semantics and example

A Systematic Approach to Semantics-Based Image Retrieval and Organization Using Thesaurus

Enhancing Remote Sensing Image Retrieval: A Hierarchical Approach Integrating Visual and Semantic Similarities

Semantic Retrieval of Remote Sensing Images Based on the Bag-of-Words Association Mapping Method

Image Annotation by Large-Scale Content-Based Image Retrieval

Zero-Shot Sketch-Based Remote-Sensing Image Retrieval Based on Multi-Level and Attention-Guided Tokenization

ICICLE: A Semantic-Based Retrieval System for WWW Images

Query-by-sketch Semantic Object Image Retrieval

An Image Retrieval And Semi-Automatic Annotation Scheme For Large Image Databases On The Web