MS3Net: a deep ensemble learning approach for ship classification in heterogeneous remote sensing data

Bole Wilfried TieninGuolong CuiChiagoziem Chima UkwuomaYannick Abel Talla NanaRoldan Mba EsidangEguer Zacarias Moniz Moreiraa School of Information and Communication Engineering,University of Electronic Science and Technology of China,Chengdu,Sichuan,Chinab College of Nuclear Technology and Automation Engineering,Chengdu University of Technology,Sichuan,Chinac Sichuan Engineering Technology Research Center for Industrial Internet Intelligent Monitoring and Application,Chengdu University of Technology,Sichuan,Chinad School of Information Science and Technology,Southwest Jiaotong University,Chengdu,Sichuan,China
DOI: https://doi.org/10.1080/01431161.2024.2302953
IF: 3.531
2024-01-29
International Journal of Remote Sensing
Abstract:Maritime ship classification is essential for effectively monitoring oceanic activities but faces challenges when using heterogeneous remote sensing data. This research presents a novel dataset called Heterogeneous Ship Data, created by combining synthetic aperture radar (SAR) and optical satellite images of ships. The integration of multimodal data aims to harness the complementary advantages of SAR's all-weather imaging capabilities and optical data's higher interretability and shape discrimination. To leverage the heterogeneous dataset, this paper proposes an ensemble deep learning model called MS3Net (Multi-convolutional network using Spatial attention and Second-order pooling for Ship image classification). MS3Net integrates spatial attention mechanisms and second-order pooling. MS3Net has multiple convolutional branches, each tailored to extract distinct representations from the SAR and optical data through varied filter sizes and kernels. The spatial attention units focus the learning on the most informative parts of the image while suppressing irrelevant features. The second-order pooling captures higher-order interactions between feature elements to model complex relationships. A concatenation fusion strategy aggregates the diverse branch outputs into a joint multimodal feature representation for enhanced discrimination. Extensive experiments demonstrate MS3Net's superior classification accuracy over prior state-of-the-art approaches. The proposed heterogeneous data representation coupled with the attention-based deep ensemble learning model contributes to maritime surveillance by enabling robust ship recognition across diverse sensing modalities.
imaging science & photographic technology,remote sensing
What problem does this paper attempt to address?