Multimodal Target Detection Algorithm Based on Adaptive Feature Fusion

Yitong Li, Chuchao He, Ruohai Di, Peng Wang, Mengyu Sun, Xiaoyan Li
DOI: https://doi.org/10.1117/12.3039437
2024-01-01
Abstract: Multimodal target detection algorithms fuse features from different modalities poorly, which lowers detection accuracy. This paper therefore improves the MVX-Net algorithm and proposes an adaptive multimodal feature fusion algorithm, AF-MVX-Net (adaptive fusion). The algorithm builds on the MVX-Net framework and adds an adaptive multimodal feature fusion module, AFM (Adaptation Fusion Module). The module analyzes the relationship between local and global features and adaptively increases the weight of important features in the fused data, which makes multimodal fusion more effective and in turn improves detection accuracy. Experiments on the KITTI dataset show that the average 3DAP value over all categories for simple targets increases by 8.55% to 76.1%. For the vehicle category, 3DAP@0.7 increases by 2%; for the bicycle category, 3DAP@0.5 increases by 5~6%; and for the pedestrian category, 3DAP@0.5 increases by 10~13%. These gains improve the detection accuracy of bicycles, pedestrians, and vehicles in autonomous driving scenarios and show that the AF-MVX-Net algorithm is effective.
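The abstract does not give the internal design of the AFM, so the following is only a minimal sketch of one common way to weight fused features adaptively from local and global context. It is not the authors' implementation. The function `adaptive_fuse`, the weight matrices `w_local` and `w_global`, and the sigmoid gating scheme are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def adaptive_fuse(img_feat, pts_feat, w_local, w_global):
    """Fuse image and point-cloud features with a learned-style gate.

    Hypothetical sketch, not the AFM from the paper.
    img_feat, pts_feat: arrays of shape (N, C), one row per point/voxel.
    w_local, w_global:  arrays of shape (C, C), assumed projection weights.
    """
    summed = img_feat + pts_feat                      # (N, C) initial fusion
    local_ctx = summed @ w_local                      # (N, C) per-row context
    # Global context: average over all rows, then project; shape (1, C).
    global_ctx = summed.mean(axis=0, keepdims=True) @ w_global
    # Gate combines local and global context; broadcasts to (N, C).
    gate = sigmoid(local_ctx + global_ctx)
    # Convex combination: the gate decides how much each modality contributes.
    return gate * img_feat + (1.0 - gate) * pts_feat


rng = np.random.default_rng(0)
N, C = 5, 4
img_feat = rng.normal(size=(N, C))
pts_feat = rng.normal(size=(N, C))
w_local = rng.normal(size=(C, C)) * 0.1
w_global = rng.normal(size=(C, C)) * 0.1

fused = adaptive_fuse(img_feat, pts_feat, w_local, w_global)
print(fused.shape)
```

Because the gate lies strictly between 0 and 1, each fused value falls between the corresponding image and point-cloud values. In a trained network the two weight matrices would be learned, so that the gate favors whichever modality is more informative for each feature.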