Abstract: Although autonomous vehicles (AVs) are expected to revolutionize transportation, robust perception across a wide range of driving contexts remains a significant challenge. Techniques to fuse sensor data from camera, radar, and lidar sensors have been proposed to improve AV perception. However, existing methods are insufficiently robust in difficult driving contexts (e.g., bad weather, low light, sensor obstruction) due to rigidity in their fusion implementations. These methods fall into two broad categories: (i) early fusion, which fails when sensor data is noisy or obscured, and (ii) late fusion, which cannot leverage features from multiple sensors and thus produces worse estimates. To address these limitations, we propose HydraFusion: a selective sensor fusion framework that learns to identify the current driving context and fuses the best combination of sensors to maximize robustness without compromising efficiency. HydraFusion is the first approach to propose dynamically adjusting between early fusion, late fusion, and combinations in between, thus varying both how and when fusion is applied. We show that, on average, HydraFusion outperforms early and late fusion approaches by 13.66% and 14.54%, respectively, without increasing computational complexity or energy consumption on the industry-standard Nvidia Drive PX2 AV hardware platform. We also propose and evaluate both static and deep-learning-based context identification strategies. Our open-source code and model implementation are available at https://github.com/AICPS/hydrafusion.

What problem does this paper attempt to address?

This paper addresses the problem of building a robust and efficient perception system for autonomous vehicles (AVs). Specifically, it focuses on improving the environmental perception of self-driving cars by fusing data from camera, radar, and Light Detection and Ranging (LiDAR) sensors across varied driving environments. Existing sensor fusion methods perform poorly under harsh driving conditions (such as bad weather, low light, or sensor occlusion), mainly because their fusion implementations are too rigid to adapt to different driving scenarios. The paper therefore proposes HydraFusion, a selective sensor fusion framework that dynamically selects the best sensor combination to fuse for the current driving environment, improving robustness without sacrificing efficiency. The main contributions of HydraFusion are as follows:
1. A novel multi-branch sensor fusion architecture that supports early, late, and intermediate fusion.
2. A context-based gating strategy that maximizes robustness by dynamically selecting the fusion method.
3. Validation on a real-world dataset covering multiple driving environments.
4. An implementation on an industry-standard AV hardware platform, demonstrating deployment feasibility with energy consumption, latency, and memory usage comparable to existing state-of-the-art methods.
5. An open-source release of the algorithm implementation and architecture to promote further research on selective sensor fusion.
The paper shows through theoretical and qualitative analysis that, in some cases, fusing all available sensor measurements can reduce the accuracy of the estimate, so not every measurement should be fused. HydraFusion addresses this by selectively fusing sensor data according to the scene context, thereby achieving more accurate object detection across driving environments.
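The idea of context-based gating over fusion branches can be illustrated with a minimal Python sketch. This is not the authors' implementation: the context labels, the branch names, the `STATIC_GATE` table, and the stub detectors are all hypothetical and chosen only to show how a static gate might pick which branches to execute for a given driving context.

```python
from typing import Callable, Dict, List

Detection = Dict[str, float]
Branch = Callable[[Dict[str, list]], List[Detection]]

# Hypothetical context labels and branch names (not taken from the paper).
STATIC_GATE: Dict[str, List[str]] = {
    "clear_day": ["camera", "camera_lidar_early"],
    "night": ["radar", "lidar"],
    "fog": ["radar"],
}
# Assumed fallback when the context is not recognized.
FALLBACK_BRANCHES: List[str] = ["camera", "radar", "lidar"]


def select_branches(context: str) -> List[str]:
    """Return the branch names to execute for a given context label."""
    return STATIC_GATE.get(context, FALLBACK_BRANCHES)


def run_selected(context: str, branches: Dict[str, Branch],
                 sensor_data: Dict[str, list]) -> List[Detection]:
    """Run only the gated branches and pool their detections."""
    detections: List[Detection] = []
    for name in select_branches(context):
        detections.extend(branches[name](sensor_data))
    return detections


def make_stub_branch(name: str, calls: List[str]) -> Branch:
    """Build a stand-in for a trained detector that logs when it runs."""
    def branch(sensor_data: Dict[str, list]) -> List[Detection]:
        calls.append(name)
        return [{"score": 0.5}]
    return branch


# Usage: in the assumed "fog" context only the radar branch is executed.
calls: List[str] = []
branches = {n: make_stub_branch(n, calls)
            for n in ["camera", "radar", "lidar", "camera_lidar_early"]}
fog_detections = run_selected("fog", branches, {"radar": []})
```

Because unselected branches are never called, a gate of this kind saves computation as well as avoiding degraded sensor inputs. A learned gate would replace the lookup table with a model that predicts the context or the branch set from the sensor data.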

HydraFusion: Context-Aware Selective Sensor Fusion for Robust and Efficient Autonomous Vehicle Perception

EcoFusion: Energy-Aware Adaptive Sensor Fusion for Efficient Autonomous Vehicle Perception

Real-Time Hybrid Multi-Sensor Fusion Framework for Perception in Autonomous Vehicles

Autonomous Multi-Sensor Fusion Techniques for Environmental Perception in Self-Driving Vehicles

ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions

Higher Accuracy and Lower Computational Perception Environment Based Upon a Real-time Dynamic Region of Interest

Learning Selective Sensor Fusion for State Estimation

Multi-Modality Cascaded Fusion Technology for Autonomous Driving

Scalable Primitives for Generalized Sensor Fusion in Autonomous Vehicles

Learning Selective Sensor Fusion for States Estimation

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

Learnable fusion mechanisms for multimodal object detection in autonomous vehicles

Towards Efficient Architecture and Algorithms for Sensor Fusion

Enhanced Perception for Autonomous Driving Using Semantic and Geometric Data Fusion

Enabling Efficient Deep Convolutional Neural Network-based Sensor Fusion for Autonomous Driving

TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

Towards Self-Supervised High Level Sensor Fusion

Sensor Fusion Method for Object Detection and Distance Estimation in Assisted Driving Applications

Robust Cognitive Capability in Autonomous Driving Using Sensor Fusion Techniques: A Survey

Improving Autonomous Vehicle Visual Perception by Fusing Human Gaze and Machine Vision