KCDNet: Multimodal Object Detection in Modal Information Imbalance Scenes

Haoyu Wang, Shiyuan Qu, Zhenzhuang Qiao, Xiaomin Liu
DOI: https://doi.org/10.1109/tim.2024.3441019
IF: 5.6
2024-08-25
IEEE Transactions on Instrumentation and Measurement
Abstract: Inspired by the human use of multiple senses to perceive the world, multimodal object detection methods adapt to the environment by integrating information from different modalities. However, modalities tend to exhibit significant heterogeneity because of differences in data collection instrumentation and measurement. As a result, the amount of task-related information that each modality carries in the same scenario can differ substantially, a situation referred to as a modal information imbalance scene. To address this problem, we categorize modal information imbalance into two types, local and global, and propose a knowledge complementary detection network (KCDNet). First, an information entropy assessment mechanism (IEAM) identifies modal information imbalance by quantifying the amount of task-related information in multimodal data. Second, a knowledge complementary mechanism alleviates local modal information imbalance through category knowledge complementation and spatial knowledge complementation, suppressing intermodal information interference. Finally, a dynamic balancing mechanism (DBM) monitors and balances the model's preference for different modalities during learning, alleviating global modal information imbalance. Together, these operations keep multimodal feature learning synchronized and improve the model's ability to mine complementary information. Applied to various object detectors, KCDNet outperforms existing state-of-the-art methods by noticeable margins.
engineering, electrical & electronic; instruments & instrumentation
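
The abstract says that IEAM identifies modal information imbalance by quantifying information with entropy, but it gives no formula. As a rough illustration only, the sketch below computes plain Shannon entropy over the intensity values of two toy patches and compares them. This is not the paper's IEAM: the function name, the toy data, and the use of raw intensity entropy are all assumptions, and raw intensity entropy does not measure task-related information in the way the abstract describes.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits of a flat sequence of discrete values."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy patches (illustrative values, not data from the paper):
# a visible-light patch with varied intensities and a nearly flat thermal patch.
rgb_patch = [0, 1, 2, 3, 0, 1, 2, 3]
thermal_patch = [5, 5, 5, 5, 5, 5, 5, 6]

h_rgb = shannon_entropy(rgb_patch)
h_thermal = shannon_entropy(thermal_patch)

# A large gap between the two entropies is one crude signal that the
# modalities carry unequal amounts of information for this patch.
entropy_gap = abs(h_rgb - h_thermal)
print(f"rgb={h_rgb:.3f} bits, thermal={h_thermal:.3f} bits, gap={entropy_gap:.3f}")
```

Computing such a gap per region would loosely correspond to the local case, and computing it over a whole image or dataset to the global case; how KCDNet actually defines and uses these quantities is specified in the paper itself.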