Multisource Remote Sensing Image Classification Based on Adaptive Pooling Transformer and Multilevel Correction

Cong Wang, Xiao Chen Shi, Feng Gao, Jun Yu Dong
DOI: https://doi.org/10.1117/12.3019581
2024-01-01
Abstract: Deep learning and neural networks have become the prevailing methods for processing remote sensing imagery. Relying on a single remote sensing data source inevitably runs into limitations, such as the difficulty of separating similar characteristics within hyperspectral data. Efforts to combine multiple remote sensing data sources are therefore intensifying, with the aim of obtaining more refined and comprehensive observations. In this article, we propose a framework for multi-source remote sensing image classification, named the adaptive pooling transformer network (APTnet). First, convolutional neural networks (CNNs) are used to extract distinctive features from both hyperspectral images (HSI) and synthetic aperture radar (SAR) images. Second, we propose a technique for fusing remote sensing data, the cross-modal correction fusion method (CMCF), which enables interactive learning and correction of multi-source features across different levels. Third, an adaptive pooling Transformer method (APT) is proposed to strengthen the existing features and increase their expressive capacity. Experiments on the Berlin and Augsburg datasets show strong performance, supported by extensive comparisons with existing methods.