Alignment and Fusion Using Distinct Sensor Data for Multimodal Aerial Scene Classification

Xu Sun, Junyu Gao, Yuan Yuan
DOI: https://doi.org/10.1109/tgrs.2024.3406697
IF: 8.2
2024-06-12
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Multimodal sensors provide rich and diverse data, which helps in classifying similar and complex aerial scenes. However, the heterogeneity of data collected from different sensors makes alignment and fusion challenging. To address this, we present a multimodal aerial scene classification approach that extracts distinct modal information representations and aligns and fuses semantic information at both the data and feature levels. First, an adaptive zero-crossing rate (AZCR) module is proposed to convert sequential data into images, achieving alignment at the data level. This module extracts temporal and frequency domain features from sequential data through adaptive parameter adjustments. Second, we propose a multimodal alignment and fusion (MMAF) module to align and fuse the distinct data, achieving comprehensive modality integration at the feature level. Finally, a multimodal alignment loss (ALIGN Loss) function is designed to assess the alignment outcomes and constrain the training process. Experiments on two public datasets demonstrate that the approach classifies aerial scenes accurately: the proposed method achieves F1 scores of 81.32% and 59.80% on the AuDio Visual Aerial sceNe reCognition datasEt (ADVANCE) and urban region function classification (URFC) datasets, respectively.
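For readers unfamiliar with the quantity the AZCR module is named after, the following is a minimal sketch of a plain, fixed-parameter zero-crossing rate computed frame by frame over a one-dimensional signal. It is not the paper's adaptive module, whose details the abstract does not give; the function name `zero_crossing_rate` and the parameters `frame_len` and `hop` are assumptions chosen for illustration.

```python
import numpy as np


def zero_crossing_rate(signal, frame_len=256, hop=128):
    """Frame-wise zero-crossing rate of a 1-D signal.

    Illustrative only: fixed frame length and hop size, not the
    adaptive parameters described in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 1 or signal.size < frame_len:
        raise ValueError("signal must be 1-D and at least frame_len samples long")

    n_frames = 1 + (signal.size - frame_len) // hop
    rates = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # True where a sample is negative; a crossing is a change of sign
        # between two consecutive samples.
        negative = np.signbit(frame)
        crossings = np.count_nonzero(negative[1:] != negative[:-1])
        rates[i] = crossings / (frame_len - 1)
    return rates


# Hypothetical usage: a signal that flips sign at every sample crosses zero
# between every pair of neighbours, while a constant signal never does.
alternating = np.tile([1.0, -1.0], 256)
constant = np.ones(512)
zcr_alternating = zero_crossing_rate(alternating)
zcr_constant = zero_crossing_rate(constant)
```

Each value lies between 0 and 1 and summarizes how often the signal changes sign within one frame, which is a coarse indicator of its dominant frequency content.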
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics