Abstract: Accurately and reliably perceiving the environment is a major challenge in autonomous driving and robotics research. Traditional vision-based methods often degrade under varying lighting conditions, occlusions, and complex environments. This paper addresses these challenges by fusing the output of a deep learning-based object detection algorithm, YOLOv8, with LiDAR data. The fusion combines the complementary strengths of the two technologies: YOLOv8 excels at real-time object detection and classification in RGB images, while LiDAR provides accurate distance measurements and 3D spatial information regardless of lighting conditions. The integration pairs the accuracy and robustness of YOLOv8 in identifying and classifying objects with the depth data provided by LiDAR, which enhances overall environmental perception and is critical for the reliability and safety of autonomous systems. However, this fusion raises several research challenges, including calibration between the different sensors, filtering ground points from LiDAR point clouds, and managing the computational complexity of processing large datasets. This paper presents a comprehensive approach to address these challenges. Firstly, a simple algorithm is introduced to filter out ground points from LiDAR point clouds by setting different height thresholds based on the terrain; removing these points is essential for accurate object detection. Secondly, YOLOv8, trained on a customized dataset, is used for object detection in images, generating 2D bounding boxes around detected objects. Thirdly, a calibration algorithm is developed to transform 3D LiDAR coordinates into image pixel coordinates, a step that is vital for correlating LiDAR data with the image-based detection results. Fourthly, a method for clustering different objects based on the fused data is proposed, followed by an object tracking algorithm that computes the 3D poses of objects and their distances relative to the robot. The Agilex Scout Mini robot, equipped with a Velodyne 16-channel LiDAR and an Intel D435 camera, is employed for data collection and experimentation. Finally, the experimental results validate the effectiveness of the proposed algorithms and methods.
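The first three steps of the pipeline can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single fixed height threshold rather than terrain-dependent thresholds, a plain pinhole camera model without lens distortion, and a 2D box supplied by hand in place of a real YOLOv8 detection. The function names, the rotation `R`, the translation `T`, and the intrinsics `fx`, `fy`, `cx`, `cy` are all hypothetical placeholders chosen for illustration.

```python
import math
from statistics import median


def remove_ground(points, z_threshold):
    """Keep LiDAR points (x, y, z) whose height z lies above a fixed threshold.

    Simplification: one global threshold, not terrain-dependent thresholds.
    """
    return [p for p in points if p[2] > z_threshold]


def lidar_to_pixel(point, rotation, translation, fx, fy, cx, cy):
    """Project one LiDAR point into the image with a pinhole model.

    Returns (u, v, depth), or None if the point lies behind the camera.
    """
    # Rigid transform from the LiDAR frame to the camera frame.
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        return None
    return fx * xc / zc + cx, fy * yc / zc + cy, zc


def distance_in_box(points, box, rotation, translation, fx, fy, cx, cy):
    """Median range of the points that project inside a 2D bounding box.

    box is (u_min, v_min, u_max, v_max) in pixels; returns None if no point hits.
    """
    u_min, v_min, u_max, v_max = box
    ranges = []
    for p in points:
        projected = lidar_to_pixel(p, rotation, translation, fx, fy, cx, cy)
        if projected is None:
            continue
        u, v, _ = projected
        if u_min <= u <= u_max and v_min <= v <= v_max:
            ranges.append(math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2))
    return median(ranges) if ranges else None


# Hypothetical calibration: LiDAR frame (x forward, y left, z up) mapped to
# camera frame (x right, y down, z forward), with no offset between sensors.
R = [[0, -1, 0], [0, 0, -1], [1, 0, 0]]
T = [0.0, 0.0, 0.0]
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

cloud = [
    (5.0, 0.0, 0.2),   # on the object
    (5.0, 0.1, 0.0),   # on the object
    (5.0, 0.0, -0.4),  # ground return
    (5.0, 3.0, 0.2),   # far to the left, outside the box
    (-2.0, 0.0, 0.2),  # behind the camera
]
box = (300, 200, 340, 260)  # stand-in for one detector bounding box

non_ground = remove_ground(cloud, z_threshold=-0.3)
print(distance_in_box(non_ground, box, R, T, FX, FY, CX, CY))
```

Taking the median range of the in-box points is one simple way to resist stray background returns that fall inside the box; the clustering step described in the abstract serves a similar purpose in a more principled way.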

An Advanced Approach to Object Detection and Tracking in Robotics and Autonomous Vehicles Using YOLOv8 and LiDAR Data Fusion
