TFIENet: Transformer Fusion Information Enhancement Network for Multimodel 3-D Object Detection.
Feng Cao, Yufeng Jin, Chongben Tao, Xizhao Luo, Zhen Gao, Zufeng Zhang, Sifa Zheng, Yuan Zhu
DOI: https://doi.org/10.1109/tim.2024.3451586
IF: 5.6
2024-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: During feature-level data fusion in 3-D object detection, misalignment between modalities destroys the correlation between their data, which leads to inaccurate localization of small targets at long distances. To address this problem, a transformer fusion information enhancement network (TFIENet) is proposed. First, the original point cloud and color images are taken as input and passed through standard feature-extraction backbone networks to obtain LiDAR point cloud features and image features, respectively. Second, a region proposal network based on transformer dual-fusion features is designed, which uses a deformable transformer decoder to fuse the extracted LiDAR point cloud features and image features twice through a deformable attention mechanism. The dual-domain feature information from the LiDAR and the camera is then aggregated to generate the initial candidate boxes. Next, a feature information enhancement module further refines the boxes: it predicts dense depth features using a depth completion mechanism, and the corresponding dense depth information and semantic feature information are extracted to complete the box refinement. Finally, to align and fuse feature information from different modalities effectively, a multimodal feature cross-attention module (MFCAM) is designed, in which a dynamic cross-attention mechanism is applied to capture the correlation between modalities. Experimental results on the KITTI, nuScenes, and Waymo datasets demonstrate the generality and effectiveness of the proposed TFIENet method. Extensive ablation experiments demonstrate the efficiency of each individual module. Experimental results on a real road dataset show that TFIENet is robust in complex real road environments.
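
To make the idea of cross-modal attention concrete, the following is a minimal sketch of plain scaled dot-product cross-attention, in which LiDAR features act as queries and image features act as keys and values. It is an illustration only, not the paper's MFCAM, its dynamic cross-attention, or its deformable attention. The function names, the feature dimensions, and the toy feature values are all assumptions introduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    queries: vectors from one modality (here: LiDAR features).
    keys, values: vectors from the other modality (here: image features).
    Returns (fused, weights): one fused vector and one weight row per query.
    """
    scale = math.sqrt(len(queries[0]))
    dim = len(values[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, normalized to sum to 1.
        row = softmax([dot(q, k) / scale for k in keys])
        weights.append(row)
        # Weighted average of the value vectors.
        fused.append([sum(w * v[j] for w, v in zip(row, values))
                      for j in range(dim)])
    return fused, weights


# Toy, made-up features (assumption: 2-D embeddings for readability).
lidar_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
fused, weights = cross_attention(lidar_feats, image_feats, image_feats)
```

Each row of `weights` sums to one, so every fused vector is a convex combination of the image features, weighted by how strongly each one matches the LiDAR query. Real fusion modules add learned projections, multiple heads, and (in the deformable case) sparse sampling around reference points, none of which is modeled here.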