Abstract: Accurately and reliably perceiving the environment is a major challenge in autonomous driving and robotics research. Traditional vision-based methods often suffer from varying lighting conditions, occlusions, and complex environments. This paper addresses these challenges by combining a deep learning-based object detection algorithm, YOLOv8, with LiDAR data fusion. The two technologies are complementary: YOLOv8 excels at real-time object detection and classification from RGB images, while LiDAR provides accurate distance measurements and 3D spatial information regardless of lighting conditions. The integration aims to exploit both the accuracy and robustness of YOLOv8 in identifying and classifying objects and the depth data provided by LiDAR. This combination enhances overall environmental perception, which is critical for the reliability and safety of autonomous systems. However, the fusion raises several research challenges, including calibration between the different sensors, filtering ground points from LiDAR point clouds, and managing the computational cost of processing large datasets. This paper presents a comprehensive approach to address these challenges. Firstly, a simple algorithm is introduced to filter ground points out of LiDAR point clouds, a step that is essential for accurate object detection, by setting different threshold heights based on the terrain. Secondly, YOLOv8, trained on a customized dataset, is used for object detection in images, generating 2D bounding boxes around detected objects. Thirdly, a calibration algorithm is developed to transform 3D LiDAR coordinates into image pixel coordinates, which is vital for correlating LiDAR data with image-based object detection results. Fourthly, a method for clustering different objects based on the fused data is proposed, followed by an object tracking algorithm that computes the 3D poses of objects and their relative distances from the robot. An Agilex Scout Mini robot, equipped with a Velodyne 16-channel LiDAR and an Intel D435 camera, is employed for data collection and experimentation. Finally, the experimental results validate the effectiveness of the proposed algorithms and methods.
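
The first and third steps of the abstract, together with the association of LiDAR points to a 2D bounding box, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract gives no formulas, so the sketch assumes locally flat ground, an ideal pinhole camera without lens distortion, and a known rigid LiDAR-to-camera transform. All function names, parameter names, and numeric values below are hypothetical and chosen only for illustration.

```python
import math
from statistics import median

def filter_ground(points, ground_z, height_threshold):
    """Keep points more than height_threshold above an assumed flat ground plane."""
    return [p for p in points if p[2] - ground_z > height_threshold]

def lidar_to_camera(point, R, t):
    """Rigid transform into the camera frame: X_cam = R * X_lidar + t."""
    x, y, z = point
    return tuple(R[i][0] * x + R[i][1] * y + R[i][2] * z + t[i] for i in range(3))

def project_to_pixel(point_cam, fx, fy, cx, cy):
    """Pinhole projection; returns None for points behind the camera."""
    X, Y, Z = point_cam
    if Z <= 0:
        return None
    return (fx * X / Z + cx, fy * Y / Z + cy)

def distance_in_box(points, box, R, t, intrinsics):
    """Median range of the LiDAR points whose projections fall inside a 2D box."""
    x_min, y_min, x_max, y_max = box
    ranges = []
    for p in points:
        uv = project_to_pixel(lidar_to_camera(p, R, t), *intrinsics)
        if uv is None:
            continue
        u, v = uv
        if x_min <= u <= x_max and y_min <= v <= y_max:
            ranges.append(math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2))
    return median(ranges) if ranges else None

# Assumed axes: LiDAR x forward, y left, z up; camera x right, y down, z forward.
R = [[0, -1, 0], [0, 0, -1], [1, 0, 0]]
t = (0.0, 0.0, 0.0)                      # assumed: sensors share one origin
intrinsics = (600.0, 600.0, 320.0, 240.0)  # assumed fx, fy, cx, cy

cloud = [
    (3.0, 0.0, -0.40), (4.0, 0.5, -0.35),                  # ground returns
    (5.0, 0.0, 0.10), (5.0, 0.1, 0.20), (5.0, -0.1, 0.0),  # object ahead
    (5.0, 3.0, 0.10),                                      # object outside the box
]
# Assumed: LiDAR mounted 0.4 m above the ground, 0.2 m height threshold.
non_ground = filter_ground(cloud, ground_z=-0.4, height_threshold=0.2)
box = (300, 200, 340, 250)  # stand-in for a detector's 2D box (x_min, y_min, x_max, y_max)
distance = distance_in_box(non_ground, box, R, t, intrinsics)
print(len(non_ground), distance)
```

Taking the median range, rather than the mean, is one simple way to reduce the influence of background points that project into the same box; the paper's own clustering step may handle this differently.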

Semantically Enhanced Multi-Object Detection and Tracking for Autonomous Vehicles

Online Multi-Object Tracking from A Bird's-Eye View by Fusion of Millimeter-Wave Radar and Vision

PF-MOT: Probability Fusion Based 3D Multi-Object Tracking for Autonomous Vehicles

Dynamic Multi-LiDAR Based Multiple Object Detection and Tracking

Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving

Object-Level Pseudo-3D Lifting for Distance-Aware Tracking

Multi-Camera Multiple 3D Object Tracking on the Move for Autonomous Vehicles

3D Multi-object Detection and Tracking with Sparse Stationary LiDAR

Lightweight Map-Enhanced 3D Object Detection and Tracking for Autonomous Driving

Enhancing 3D object detection through multi-modal fusion for cooperative perception

Multi-Object Tracking with Camera-LiDAR Fusion for Autonomous Driving

Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences

Joint Multi-Object Detection and Tracking with Camera-LiDAR Fusion for Autonomous Driving

3-D Objects Detection and Tracking Using Solid-State LiDAR and RGB Camera

Spatial Information Enhancement with Multi-Scale Feature Aggregation for Long-Range Object and Small Reflective Area Object Detection from Point Cloud

Deep Learning-Based Robust Multi-Object Tracking via Fusion of mmWave Radar and Camera Sensors

An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds

3D Multi-Object Tracking Employing MS-GLMB Filter for Autonomous Driving

3D Multi-Object Tracking Based on Radar-Camera Fusion

An Advanced Approach to Object Detection and Tracking in Robotics and Autonomous Vehicles Using YOLOv8 and LiDAR Data Fusion

Transformer-Based Optimized Multimodal Fusion for 3D Object Detection in Autonomous Driving