Abstract: Leveraging multi-view remote sensing images can significantly improve the accuracy of scene classification. Using several views simultaneously, however, often introduces misalignment between visual content and semantic labels, which complicates classification. In addition, as the number of viewpoints increases, image quality issues further limit the effectiveness of multi-view classification. Traditional scene classification methods predominantly rely on softmax-based deep learning classifiers, which can neither assess the quality of remote sensing images nor provide explicit explanations for the network's predictions. To address these issues, this paper introduces a novel end-to-end multi-view decision fusion network designed for remote sensing scene classification. The network integrates information from multi-view remote sensing images under the guidance of image credibility and uncertainty; when the views conflict during fusion, it alleviates the conflict and yields more reasonable and credible predictions. First, multi-scale features are extracted from the multi-view images using convolutional neural networks (CNNs). Next, an asymptotic adaptive feature fusion module (AAFFM) gradually integrates these multi-scale features. An adaptive spatial fusion method then assigns different spatial weights to the multi-scale feature maps, enhancing the model's feature discrimination capability. Finally, an evidence decision fusion module (EDFM), built on evidence theory and the Dirichlet distribution, quantitatively assesses the uncertainty in the multi-view classification process; by fusing the multi-view image information in this module, it provides a rational explanation for the prediction results. The method was validated through experiments on the AiRound and CV-BrCT datasets. The results show that it improves not only single-view scene classification but also multi-view remote sensing scene classification, by accurately characterizing the scene and mitigating conflicts in the fusion process.
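The adaptive spatial fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the multi-scale feature maps have already been resized to a common resolution and that the per-pixel weight logits are given; in a real network they would be predicted by a learned layer. The function name `adaptive_spatial_fusion` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def adaptive_spatial_fusion(feature_maps, weight_logits):
    """Fuse same-sized 2D feature maps with per-pixel softmax weights.

    feature_maps:  list of S maps, each an H x W nested list of floats.
    weight_logits: list of S maps of the same shape (assumed given here;
                   a trained model would predict them).
    """
    num_scales = len(feature_maps)
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Softmax across scales at this location, so the weights sum to 1.
            logits = [weight_logits[s][i][j] for s in range(num_scales)]
            peak = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            fused[i][j] = sum(
                (exps[s] / total) * feature_maps[s][i][j]
                for s in range(num_scales)
            )
    return fused


# Toy usage: two 1 x 2 feature maps with equal logits are simply averaged.
coarse = [[1.0, 2.0]]
fine = [[3.0, 4.0]]
equal_logits = [[[0.0, 0.0]], [[0.0, 0.0]]]
fused_equal = adaptive_spatial_fusion([coarse, fine], equal_logits)
```

Because the weights are normalized per location rather than per map, each pixel can favor a different scale, which is the sense in which the spatial weights are "adaptive" in this sketch.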

What problem does this paper attempt to address?

The paper addresses the challenges encountered in multi-view remote sensing scene classification. Specifically:

1. **Image Quality and Feature Representation Problems**:
   - As the number of views increases, the quality of remote sensing images becomes an important factor limiting multi-view classification. Factors such as viewing angle and illumination conditions can cause differences in image quality, which in turn affect classification.
   - Simple network structures cannot effectively distinguish the features of remote sensing images from different views, resulting in information loss and lower classification accuracy.
2. **Conflict Problems in Multi-view Information Fusion**:
   - During multi-view information fusion, conflicts inevitably occur between aerial images and ground images. Traditional evidence fusion rules cannot handle these conflicts effectively, so some evidence values violate common sense and introduce deviations.
   - These conflicts produce counterintuitive results and prevent a reasonable explanation of the final predictions.
3. **Limitations of Traditional Methods**:
   - Traditional scene classification methods mainly rely on softmax-based deep learning classifiers, cannot evaluate the quality of remote sensing images, and cannot provide a clear explanation for the network's predictions.

To solve these problems, the paper proposes a new multi-view fusion architecture, the **Multi-View Evidential Decision-making Fusion Network (MVEDFN)**.

The main contributions of this method are as follows:

- **Enhancing the Reliability and Anti-interference Ability of Multi-view Scene Classification**: MVEDFN processes multi-view remote sensing images simultaneously and performs end-to-end multi-view scene classification, further improving the accuracy of multi-view remote sensing scene classification.
- **Reducing Information Loss and Generating More Discriminative Classification Features**: The Asymptotic Adaptive Feature Fusion Module (AAFFM) is proposed, which can quickly fuse multi-scale features and supports the subsequent multi-view scene classification.
- **Evidence Decision Fusion Module (EDFM)**: By dynamically evaluating each view with the Dirichlet distribution and integrating multi-view feature information, it alleviates the conflicts between aerial and ground image information, makes the evidence more consistent, and thus achieves reliable classification performance.

Experiments on two publicly available multi-view remote sensing image datasets show that the method integrates multi-view information effectively and achieves higher scene classification accuracy.
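To make the evidence-based fusion idea concrete, the following is a minimal sketch of the standard subjective-logic formulation: per-class evidence parameterizes a Dirichlet distribution, from which belief masses and an uncertainty mass are derived, and two views are combined with the classical reduced Dempster rule. This is an assumption for illustration only. The summary does not give EDFM's exact rule, and the paper states that traditional fusion rules handle conflicts poorly, so its rule presumably differs. The function names and the toy evidence values are hypothetical.

```python
def opinion_from_evidence(evidence):
    """Map non-negative per-class evidence to (beliefs, uncertainty).

    Dirichlet parameters are alpha_k = e_k + 1, so the strength is
    S = sum(e_k) + K, with b_k = e_k / S and u = K / S.
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty


def combine_opinions(opinion_a, opinion_b):
    """Combine two opinions with the classical reduced Dempster rule.

    Returns (beliefs, uncertainty, conflict). Opinions built by
    opinion_from_evidence always have uncertainty > 0, so conflict < 1.
    """
    (b1, u1), (b2, u2) = opinion_a, opinion_b
    # Conflict: belief mass the two views assign to different classes.
    conflict = sum(
        b1[i] * b2[j]
        for i in range(len(b1))
        for j in range(len(b2))
        if i != j
    )
    scale = 1.0 - conflict
    beliefs = [
        (b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1) / scale
        for k in range(len(b1))
    ]
    uncertainty = (u1 * u2) / scale
    return beliefs, uncertainty, conflict


# Toy usage: a confident aerial view and a weak, disagreeing ground view.
aerial = opinion_from_evidence([9.0, 0.0, 0.0])
ground = opinion_from_evidence([0.0, 1.0, 0.0])
fused_beliefs, fused_uncertainty, conflict = combine_opinions(aerial, ground)
```

In this toy case the low-evidence ground view carries a large uncertainty mass, so the fused opinion still favors the class supported by the aerial view. Unlike a softmax score, the uncertainty mass makes a view's reliability explicit.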

Multi-View Scene Classification Based on Feature Integration and Evidence Decision Fusion
