Abstract: Although deep learning has revolutionized remote sensing (RS) image scene classification, current deep learning-based approaches depend heavily on massive supervision over predetermined scene categories and perform poorly on new categories outside that predetermined set. In practice, the classification task often has to be extended as new applications emerge, and these inevitably involve new categories of RS image scenes. It therefore becomes important to give the deep learning model the inference ability to recognize RS image scenes from unseen categories, which do not overlap with the scene categories used in the training stage. By fully exploiting RS domain characteristics, this paper constructs a new remote sensing knowledge graph (RSKG) from scratch to support the recognition of unseen RS image scenes by inference. To improve the semantic representation ability of RS-oriented scene categories, this paper proposes to generate a Semantic Representation of scene categories by representation learning of RSKG (SR-RSKG). To pursue robust cross-modal matching between visual features and semantic representations, this paper proposes a novel deep alignment network (DAN) with a series of well-designed optimization constraints, which can simultaneously address zero-shot and generalized zero-shot RS image scene classification. Extensive experiments on a merged RS image scene dataset, which integrates multiple publicly available datasets, show that the proposed SR-RSKG clearly outperforms traditional knowledge types (e.g., natural language processing models and manually annotated attribute vectors), and that the proposed DAN performs better than state-of-the-art methods under both the zero-shot and generalized zero-shot RS image scene classification settings.
The constructed RSKG will be made publicly available along with this paper (https://github.com/kdy2021/SR-RSKG).
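
To make the difference between the zero-shot and generalized zero-shot settings concrete, the following is a minimal sketch of matching visual features to semantic class representations. It is not the paper's DAN: it assumes the visual features have already been mapped into the semantic space, and it uses plain cosine-similarity nearest-prototype matching as a stand-in. The class names, the 3-dimensional vectors, and the helper names `cosine_similarity` and `predict` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Pairwise cosine similarity between rows of a (n, d) and rows of b (k, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def predict(embedded, class_vectors, candidates):
    """Assign each embedded image to the most similar candidate class."""
    names = list(candidates)
    prototypes = np.stack([class_vectors[name] for name in names])
    scores = cosine_similarity(embedded, prototypes)
    return [names[i] for i in scores.argmax(axis=1)]


# Hypothetical semantic vectors, one per scene category (assumption: in the
# paper these would come from representation learning on the RSKG).
class_vectors = {
    "airport": np.array([1.0, 0.0, 0.0]),
    "forest": np.array([0.0, 1.0, 0.0]),
    "harbor": np.array([0.0, 0.0, 1.0]),
}
seen = ["airport", "forest"]   # categories available during training
unseen = ["harbor"]            # categories that never appear in training

# Hypothetical visual features already projected into the semantic space.
embedded = np.array([
    [0.1, 0.2, 0.9],   # closest to "harbor"
    [0.8, 0.1, 0.3],   # closest to "airport"
])

# Zero-shot setting: test images are assumed to come from unseen classes only,
# so the candidate set is restricted to the unseen categories.
zsl_predictions = predict(embedded, class_vectors, unseen)

# Generalized zero-shot setting: test images may come from seen or unseen
# classes, so the candidate set is the union of both.
gzsl_predictions = predict(embedded, class_vectors, seen + unseen)

print(zsl_predictions)
print(gzsl_predictions)
```

The only difference between the two calls is the candidate set, which is why the generalized setting is typically harder: every seen category competes with the unseen ones for each test image.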

Deep Zero-Shot Learning for Scene Sketch

Joint Learning of Attended Zero-Shot Features and Visual-Semantic Mapping

SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches

SceneSketcher-v2: Fine-Grained Scene-Level Sketch-Based Image Retrieval Using Adaptive GCNs

Stacked Semantic-Guided Network for Zero-Shot Sketch-Based Image Retrieval

Learning a Deep Embedding Model for Zero-Shot Learning

Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval

Towards Effective Deep Embedding for Zero-Shot Learning

Open Vocabulary Semantic Scene Sketch Understanding

On Learning Semantic Representations for Large-Scale Abstract Sketches

Progressive Cross-Modal Semantic Network for Zero-Shot Sketch-Based Image Retrieval

DeepSSN: A deep convolutional neural network to assess spatial scene similarity

Zero-shot sketch-based image retrieval via adaptive relation-aware metric learning

Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

Three-Stream Joint Network for Zero-Shot Sketch-Based Image Retrieval

Zero-Shot Scene Classification for High Spatial Resolution Remote Sensing Images

Robust deep alignment network with remote sensing knowledge graph for zero-shot and generalized zero-shot remote sensing image scene classification

Learning discriminative visual semantic embedding for zero-shot recognition

Zero-Shot Sketch-Based Image Retrieval via Graph Convolution Network

Transferable Coupled Network for Zero-Shot Sketch-Based Image Retrieval

3D Sketch-aware Semantic Scene Completion Via Semi-supervised Structure Prior