Abstract: Due to the proliferation of large-scale remote-sensing (RS) archives with multiple annotations, multilabel RS scene classification and retrieval are becoming increasingly popular. Although some recent deep learning-based methods achieve promising results in this context, little research has addressed how to learn embedding spaces under the multilabel assumption, so these models often fail to preserve the complex semantic relations pervading aerial scenes, which is an important limitation in RS applications. To fill this gap, we propose a new graph relation network (GRN) for multilabel RS scene categorization. Our GRN models the relations between samples (or scenes) by means of a graph structure that is fed into network learning. For this purpose, we define a new loss function, called scalable neighbor discriminative loss with binary cross entropy (SNDL-BCE), that embeds the graph structures through the networks more effectively. The proposed approach can guide deep learning techniques (such as convolutional neural networks) toward a more discriminative metric space, where, from a novel multilabel viewpoint, semantically similar RS scenes are closely embedded and dissimilar images are separated. To achieve this goal, our GRN jointly maximizes two terms: a weighted leave-one-out $K$-nearest neighbors ($K$NN) score in the training set, where the weight matrix describes the contributions of the nearest neighbors associated with each RS image on its class decision, and the likelihood of the class discrimination in the multilabel scenario. An extensive experimental comparison, conducted on three multilabel RS scene data archives, validates the effectiveness of the proposed GRN in terms of $K$NN classification and image retrieval. The code for this article will be made publicly available for reproducible research in the community.
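To make the two jointly optimized terms more concrete, the following is a minimal NumPy sketch of a weighted leave-one-out soft neighbor score combined with binary cross entropy. It is an illustration under stated assumptions, not the SNDL-BCE loss as defined in the paper: the abstract does not give the form of the weight matrix, so Jaccard overlap between label sets stands in for it, and the softmax temperature, the function names, and the unweighted sum of the two terms are all hypothetical choices.

```python
import numpy as np


def soft_neighbor_score(emb, labels, temperature=0.1):
    """Weighted leave-one-out soft neighbor score per sample (illustrative).

    emb:    (n, d) array of embeddings.
    labels: (n, c) array of 0/1 multilabel annotations (float).
    """
    # L2-normalize so that dot products are cosine similarities.
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    # Leave-one-out: a sample may not select itself as a neighbor.
    np.fill_diagonal(logits, -np.inf)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    # ASSUMPTION: the neighbor weight is the Jaccard overlap of label sets.
    inter = labels @ labels.T
    counts = labels.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    w = np.where(union > 0, inter / np.maximum(union, 1.0), 0.0)
    # Probability mass placed on neighbors, weighted by label agreement.
    return (w * p).sum(axis=1)


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross entropy over all samples and classes."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return -np.mean(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))


def combined_loss(emb, class_probs, labels, temperature=0.1, eps=1e-7):
    """Negative log neighbor score plus BCE (hypothetical equal weighting)."""
    score = soft_neighbor_score(emb, labels, temperature)
    neighbor_term = -np.mean(np.log(score + eps))
    return neighbor_term + binary_cross_entropy(class_probs, labels, eps)
```

Minimizing `combined_loss` pushes each sample's neighbor probability toward samples that share its labels while keeping the per-class predictions calibrated. In a real training loop the same computation would be written in an automatic-differentiation framework so that gradients reach the network producing `emb` and `class_probs`.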

Graph Relation Network: Modeling Relations Between Scenes for Multilabel Remote-Sensing Image Classification and Retrieval
