Interactive transformer and CNN network for fusion classification of hyperspectral and LiDAR data
Leiquan Wang, Wenwen Liu, Dong Lyu, Peiying Zhang, Fangming Guo, Yabin Hu, Mingming Xu
a Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
b College of Oceanography and Space Informatics, China University of Petroleum (East China), Qingdao, China
c Lab of Marine Physics and Remote Sensing, First Institute of Oceanography, Ministry of Natural Resources, Qingdao, China
Leiquan Wang received the Ph.D. degree in Communication and Electrical Systems from BUPT. He is now a lecturer in the College of Computer Science and Technology, China University of Petroleum (East China). His current research interests include multimodal fusion, image/video captioning, and hyperspectral analysis.
Wenwen Liu is a postgraduate student in the College of Computer Science and Technology, China University of Petroleum (East China). Her current research interests include fusion classification of HSI and LiDAR data.
Dong Lyu is currently an undergraduate student at China University of Petroleum (East China). His current research interests include fusion classification of HSI and LiDAR data.
Peiying Zhang is currently an Associate Professor with the College of Computer Science and Technology, China University of Petroleum (East China). He received his Ph.D. from the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, in 2019. Since 2016 he has published multiple papers in IEEE/ACM transactions, journals, and magazines, including IEEE TII, IEEE T-ITS, IEEE TVT, IEEE TNSE, IEEE TNSM, IEEE TETC, and IEEE Network. He has served on the Technical Program Committees of AAAI'24, AAAI'23, IEEE ICC'23, IEEE ICC'22, and INFOCOM Wireless-Sec 2023. He is a Lead Guest Editor for journals including Drones, Mathematics, Electronics, and Wireless Communications and Mobile Computing. He is on the editorial boards of Drones; CMC-Computers, Materials & Continua; Mobile Information Systems; International Journal of Computational Intelligence Systems; and Artificial Intelligence and Applications (AIA). His research interests include semantic computing, future internet architecture, network virtualization, and artificial intelligence for networking.
Fangming Guo received the B.E. degree in computer science and technology from China University of Petroleum (East China), Qingdao, China, in 2018. He is currently pursuing the doctoral degree with the College of Oceanography and Space Informatics, China University of Petroleum, China. His current research interests include hyperspectral image classification.
Yabin Hu received the B.S. and M.E. degrees in geography from Inner Mongolia Normal University, Hohhot, China, in 2013 and 2016, respectively, and the Ph.D. degree in computer application technology from Dalian Maritime University (DMU), Dalian, China, in 2020. He is currently a Post-Doctoral Researcher with the First Institute of Oceanography, Ministry of Natural Resources, Qingdao, China. His research interests focus on hyperspectral image (HSI) remote sensing of coastal wetlands, multi-source remote sensing technology, and applications to typical marine ecosystems.
Mingming Xu received the B.S. degree in surveying and mapping engineering from China University of Petroleum, Qingdao, China, in 2011, and the Ph.D. degree in photogrammetry and remote sensing from the State Key Lab of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China, in 2016. She is currently an Assistant Professor with the College of Oceanography and Space Informatics, China University of Petroleum. Her research interests include hyperspectral image processing and intelligent computation.
DOI: https://doi.org/10.1080/01431161.2024.2408037
IF: 3.531
2024-10-05
International Journal of Remote Sensing
Abstract: The Transformer has become pivotal for the integrated analysis of multi-source remote-sensing (RS) data in Earth observation, particularly in applications such as the fusion classification of hyperspectral images (HSI) and Light Detection and Ranging (LiDAR) data. However, Transformers are often employed as feature extractors that apply similar processing blocks to every modality, overlooking differences in imaging principles and data characteristics between sensors. Moreover, feature extraction across different sensor data typically lacks cross-modal information interaction, so the complementary information between sensors is underused and the fusion outcome is suboptimal. In this paper, we propose an interactive Transformer and CNN network for the fusion classification of HSI and LiDAR data. Specifically, a heterogeneous three-branch network architecture is designed for HSI and LiDAR data, in which Transformers encapsulate global contextual spatial and spectral information for HSI and CNNs capture geometric elevation patterns for LiDAR data. Elevation-Spatial Interaction (ESI) and Spectral-Spatial Interaction (SSI) modules are then introduced for multi-stage feature interaction. ESI enables the CNN-Transformer network to focus on essential local elevation details while simultaneously modelling global contextual spatial information. SSI enables the Transformer-Transformer network to cyclically intertwine spectral and spatial information for long-range spectral-spatial feature fusion. Finally, the interacted elevation, spatial, and spectral features pass through the Gated Fusion module, which adaptively performs hierarchical fusion and yields an elevation-spatial-spectral representation. Experiments conducted on three benchmark HSI-LiDAR datasets demonstrate the effectiveness of the proposed approach.
imaging science & photographic technology, remote sensing
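
The abstract describes a Gated Fusion module that adaptively and hierarchically fuses the elevation, spatial, and spectral features, but it does not give the gate's form. The following is a minimal sketch of one common gating scheme, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (gated_fuse, hierarchical_fusion, random_gate_params), the element-wise sigmoid gate computed from the concatenated inputs, the fusion order (elevation with spatial first, then spectral), and the use of plain Python lists in place of learned tensors.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fuse(a, b, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed (hypothetical) gate form:
        gate[i]  = sigmoid(weights[i] . concat(a, b) + bias[i])
        fused[i] = gate[i] * a[i] + (1 - gate[i]) * b[i]
    """
    concat = a + b  # list concatenation, length 2 * dim
    fused = []
    for i in range(len(a)):
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        fused.append(g * a[i] + (1.0 - g) * b[i])
    return fused


def random_gate_params(dim, rng):
    """Stand-in for learned gate parameters (random here, trained in practice)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(dim)]
    bias = [0.0] * dim
    return weights, bias


def hierarchical_fusion(elevation, spatial, spectral, params_1, params_2):
    """Two-stage fusion; the stage order is an assumption, not from the paper."""
    elev_spat = gated_fuse(elevation, spatial, *params_1)
    return gated_fuse(elev_spat, spectral, *params_2)


# Example call with toy 4-dimensional features.
rng = random.Random(0)
dim = 4
elevation = [0.2, 0.9, 0.1, 0.5]
spatial = [0.7, 0.3, 0.4, 0.6]
spectral = [0.1, 0.8, 0.9, 0.2]
fused = hierarchical_fusion(
    elevation, spatial, spectral,
    random_gate_params(dim, rng), random_gate_params(dim, rng),
)
```

Because each gate value lies strictly between 0 and 1, every fused entry is a convex combination of its two inputs, so the output stays within the range spanned by the branch features at that position.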