Multi-modal Information Fusion for LiDAR-based 3D Object Detection Framework

Ruixin Ma,Yong Yin,Jing Chen,Rihao Chang
DOI: https://doi.org/10.1007/s11042-023-15452-4
IF: 2.577
2023-01-01
Multimedia Tools and Applications
Abstract: With the rapid development of water transportation, ship safety supervision faces increasingly severe pressures and challenges. Precise and efficient detection of ship targets is becoming ever more important, which urgently requires intelligent detection methods that ultimately improve shipping management efficiency. However, surveillance video of waterway transportation is often degraded by fog and rain, which can impair object detection performance and reduce management efficiency. Traditional object detection approaches struggle to handle these problems. In this paper, we propose a novel multi-modal information fusion method for multi-object detection in waterway transportation, which introduces LiDAR (Light Detection And Ranging) data to add spatial information and counter the interference of fog and rain. The target ROI (Region Of Interest) point cloud and image data are first fused in the pre-fusion stage. This stage efficiently directs the network’s attention to the region with the highest target probability, increasing the target recall rate. The 3D bounding boxes detected in the point cloud and the 2D bounding boxes detected in the image are then fused in the post-fusion stage to improve target precision and enrich target detection information. Finally, using time synchronization and a spatial transformation matrix, the detection result is transferred to the image coordinate system to create a ship image target with 3D depth information. This technique overcomes the constraints of single-sensor environment perception, adapts to the detection of ship targets in a variety of situations, and is more precise and robust. Experiments also demonstrate the algorithm’s superiority.
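The abstract does not give implementation details, so the following is only a minimal sketch of the last two steps it describes: transferring a 3D detection into the image coordinate system and pairing it with a 2D detection. It assumes a pinhole camera model, an axis-aligned 3D box, and IoU-based matching; the function names, the calibration values, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from itertools import product


def lidar_to_pixel(point, extrinsic, intrinsic):
    """Map one LiDAR-frame point (x, y, z) to (u, v, depth) in the image."""
    x, y, z = point
    # Rigid transform into the camera frame; extrinsic is a 3x4 [R | t] matrix.
    xc, yc, zc = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsic)
    if zc <= 0:
        return None  # point lies behind the camera plane
    # Pinhole projection with a 3x3 intrinsic matrix, then perspective divide.
    u = (intrinsic[0][0] * xc + intrinsic[0][1] * yc + intrinsic[0][2] * zc) / zc
    v = (intrinsic[1][0] * xc + intrinsic[1][1] * yc + intrinsic[1][2] * zc) / zc
    return u, v, zc


def project_box(center, size, extrinsic, intrinsic):
    """Project an axis-aligned 3D box; return its 2D rectangle and nearest depth."""
    projected = []
    for signs in product((-1, 1), repeat=3):
        # One of the eight corners: center plus or minus half the size per axis.
        corner = tuple(c + sg * d / 2 for c, sg, d in zip(center, signs, size))
        pixel = lidar_to_pixel(corner, extrinsic, intrinsic)
        if pixel is None:
            return None  # box is not fully in front of the camera
        projected.append(pixel)
    us, vs, depths = zip(*projected)
    return (min(us), min(vs), max(us), max(vs)), min(depths)


def iou_2d(a, b):
    """Intersection over union of two (u_min, v_min, u_max, v_max) rectangles."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical calibration: LiDAR and camera frames coincide (z points forward).
EXTRINSIC = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
INTRINSIC = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]

# A 2 m cube detected 20 m ahead, and a hypothetical 2D detection from the image.
rect, depth = project_box((0.0, 0.0, 20.0), (2.0, 2.0, 2.0), EXTRINSIC, INTRINSIC)
image_box = (580.0, 300.0, 700.0, 420.0)
matched = iou_2d(rect, image_box) >= 0.5  # assumed matching threshold
```

When `matched` is true, the 2D image detection can be annotated with `depth`, which is one simple way to obtain an image target carrying 3D depth information. A real system would also need oriented boxes, measured calibration matrices, and time-aligned LiDAR and camera frames.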