Abstract:<span>Remote sensing scene classification has gained increasing interest in remote sensing image understanding and feature representation is the crucial factor for classification methods. Convolutional Neural Network (CNN) generally uses hierarchical deep structure to automatically learn the feature representation from the whole images and thus has been widely applied in scene classification. However, it may fail to consider the discriminative components within the image during the learning process. Moreover, the potential relationships of scene semantics are likely to be ignored. In this paper, we present a novel remote sensing scene classification method based on high-order graph convolutional network (H-GCN). Our method uses the attention mechanism to focus on the key components inside the image during CNN feature learning. More importantly, high-order graph convolutional network is applied to investigate the class dependencies. The graph structure is built where each node is described by the mean of attentive CNN features from each semantic class. The semantic class dependencies are propagated with mixing neighbor information of nodes at different orders and thus the more informative representation of nodes can be gained. The node representations of H-GCN and attention CNN features are finally integrated as the discriminative feature representation for scene classification. Experimental results on benchmark datasets demonstrate the feasibility and effectiveness of our method for remote sensing scene classification.</span>

What problem does this paper attempt to address?

The problem that this paper attempts to solve is the feature representation problem in remote sensing scene classification. Specifically: 1. **Limitations of traditional CNN methods**: Although convolutional neural networks (CNN) are excellent at automatically learning high - level features from images, they have two main weaknesses when dealing with remote sensing scene classification: - **Difficulty in distinguishing key components**: Complex remote sensing scene images may contain similar components (for example, both forests and parks may have trees), which makes CNN - based methods difficult to distinguish these key components. - **Ignoring semantic relationships**: Traditional CNN methods usually ignore the potential relationships between scene semantics, which may lead to poor classification results. 2. **Necessity of introducing the attention mechanism**: Although the attention mechanism can enhance the focus on key regions in the image to a certain extent, when the regions of interest in different scene categories have similar features, relying solely on the attention mechanism may not be sufficient to effectively distinguish these categories. For example, "dense residential areas", "medium - density residential areas" and "schools" may have similar objects of interest (such as houses), which makes it difficult to accurately classify only by attention features. 3. **Exploring the correlation of semantic categories**: In order to further improve the classification performance, the author believes that the correlation between different scene semantic categories should be explored. Theoretically, semantically similar scene categories should have more similar feature representations, and vice versa. Therefore, the author hopes to improve the ability of feature representation by modeling the correlation of these semantic categories. To solve the above problems, the author proposes a new remote sensing scene classification method based on the high - order graph convolutional network (H - GCN). This method combines the attention mechanism and CNN features, aiming to: - **Focus on key components**: Focus on the key components in the image through the attention mechanism, thereby enhancing the discriminative ability of CNN features. - **Model semantic correlations**: Build a graph structure to model the dependencies between different semantic categories, and use the high - order graph convolutional network to propagate these relationships, generating more informative node feature representations. Finally, this method improves the performance of remote sensing scene classification by integrating the attention - CNN features and H - GCN node representations. Experimental results show that this method outperforms existing methods on multiple public datasets. ### Formula summary 1. **Global average pooling in the attention mechanism**: \[ z_c=\frac{1}{H\times W}\sum_{i = 1}^{H}\sum_{j = 1}^{W}u_{c}(i,j) \] where \(z_c\) is the global average value of the \(c\) - th channel. 2. **Channel - dependency modeling**: \[ s=\sigma(f_c(\delta(f_c(z)))) \] where \(f_c(\cdot)\) represents a fully - connected layer, and \(\delta(\cdot)\) and \(\sigma(\cdot)\) are the ReLU and Sigmoid activation functions respectively. 3. **Weighted feature map**: \[ \bar{u}_{c}(i,j)=s_c\cdot u_{c}(i,j) \] 4. **Definition of the adjacency matrix in the graph structure**: \[ A_{ij}=\begin{cases}1, & \text{if }X_i\in\text{knn}(X_j)\text{ or }X_j\in\text{knn}(X_i)\\0, & \text{otherwise}\end{cases} \] 5. **Message - passing rule of the high - order graph convolutional network**: \[ H^{(l + 1)}=\bigoplus_{j\in P}\sigma(\tilde{A}^jH^{(l)}W_l^{(j)}) \] where \(\tilde{A}^j\) represents the \(j\) - th power of the adjacency matrix \(\tilde{A}\)

Remote sensing scene classification based on high-order graph convolutional network

A lightweight and stochastic depth residual attention network for remote sensing scene classification

Remote Sensing Scene Classification by Gated Bidirectional Network

Scene Classification Of High Resolution Remote Sensing Images Using Convolutional Neural Networks

Optimizing Spatial Relationships in GCN to Improve the Classification Accuracy of Remote Sensing Images

Multi-Output Network Combining GNN and CNN for Remote Sensing Scene Classification

Multi-Label Remote Sensing Image Scene Classification by Combining a Convolutional Neural Network and a Graph Neural Network

Remote Sensing Image Scene Classification Based on Global Self-Attention Module

Hyperspectral Image Classification Based on Fusion of Convolutional Neural Network and Graph Network

Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery

Robust dense graph structure based on graph convolutional network for hyperspectral image classification

Semi-Supervised Tree Species Classification for Multi-Source Remote Sensing Images Based on a Graph Convolutional Neural Network

Multi-view Graph Convolutional Network with Spectral Component Decompose for Remote Sensing Images Classification

Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification

Graph Convolutional Network With Relaxed Collaborative Representation for Hyperspectral Image Classification

Relation-Attention Networks for Remote Sensing Scene Classification

Integrating Multilayer Features of Convolutional Neural Networks for Remote Sensing Scene Classification.

Channel-Attention-Based DenseNet Network for Remote Sensing Image Scene Classification

Context Spatial Awareness Remote Sensing Image Change Detection Network Based on Graph and Convolution Interaction

Remote sensing scene classification based on contextual attention mechanism of lie group space

CNN-Enhanced Graph Convolutional Network With Pixel- and Superpixel-Level Feature Fusion for Hyperspectral Image Classification