MGSGNet-S*: Multilayer Guided Semantic Graph Network Via Knowledge Distillation for RGB-Thermal Urban Scene Parsing

Wujie Zhou, Hongping Wu, Qiuping Jiang
DOI: https://doi.org/10.1109/tiv.2024.3456437
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: Owing to rapid developments in driverless technologies, vision tasks for unmanned vehicles have gained considerable attention, particularly multimodal urban scene parsing. Although deep-learning algorithms have outperformed traditional models in such tasks, they cannot operate on mobile devices and edge networks owing to coarse-grained alignment of cross-modal complementary information, inadequate modeling of semantic-category relations, an overabundance of parameters, and high computational complexity. To address these issues, a multilayer guided semantic graph network via knowledge distillation (MGSGNet-S*) is proposed for red-green-blue-thermal urban scene parsing. First, a new cross-modal adaptive fusion module adjusts pixel-level adaptive modal complementary information by incorporating additional deep modal information and residual cross-modal matrix fine-grained attention. Second, a novel semantic graph module overcomes the misclassification of objects of the same semantic class during low-level encoding by incorporating high-level information in the Euclidean space and modeling semantic graph relationships in the non-Euclidean space. Finally, to strike a balance between accuracy and efficiency, a tailored framework optimally utilizes effective knowledge of pixel intra- and inter-class similarity, fusion features, and cross-modal correlation. Experimental results indicate that MGSGNet-S* considerably outperforms relevant state-of-the-art methods with fewer parameters and lower computational costs. The numbers of parameters and floating-point operations were reduced by 95.69% and 93.34%, respectively, relative to those of the teacher model, thus demonstrating stronger inference capabilities at 28.65 frames per second. The source code and results are available at https://github.com/Tortoisewhp/MGSGNet.
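The reduction figures quoted in the abstract can be restated as compression ratios and a per-frame time budget. The short sketch below does only that arithmetic; the function and variable names are hypothetical and not taken from the paper or its repository, and the only inputs are the three numbers stated in the abstract.

```python
# Illustrative arithmetic only: converts the percentages reported in the
# abstract into "times smaller" ratios. Names here are hypothetical.

def remaining_fraction(reduction_percent: float) -> float:
    """Fraction of the teacher's cost that the student model retains."""
    return 1.0 - reduction_percent / 100.0


def compression_ratio(reduction_percent: float) -> float:
    """How many times smaller the student is than the teacher."""
    return 1.0 / remaining_fraction(reduction_percent)


PARAM_REDUCTION = 95.69  # percent, from the abstract
FLOP_REDUCTION = 93.34   # percent, from the abstract
FPS = 28.65              # frames per second, from the abstract

param_ratio = compression_ratio(PARAM_REDUCTION)
flop_ratio = compression_ratio(FLOP_REDUCTION)
ms_per_frame = 1000.0 / FPS

print(f"parameters: about {param_ratio:.1f}x fewer than the teacher")
print(f"FLOPs:      about {flop_ratio:.1f}x fewer than the teacher")
print(f"latency:    about {ms_per_frame:.1f} ms per frame")
```

Under these figures the student retains roughly 4.3% of the teacher's parameters and 6.7% of its floating-point operations, and 28.65 frames per second corresponds to about 35 ms per frame.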