Abstract: Accurate 3D lane estimation is crucial for ensuring safety in autonomous driving. However, prevailing monocular techniques suffer from depth loss and lighting variations, hampering accurate 3D lane detection. In contrast, LiDAR points offer geometric cues and enable precise localization. In this paper, we present DV-3DLane, a novel end-to-end Dual-View multi-modal 3D Lane detection framework that synergizes the strengths of both images and LiDAR points. We propose to learn multi-modal features in dual-view spaces, i.e., perspective view (PV) and bird's-eye-view (BEV), effectively leveraging the modal-specific information. To achieve this, we introduce three designs: 1) A bidirectional feature fusion strategy that integrates multi-modal features into each view space, exploiting their unique strengths. 2) A unified query generation approach that leverages lane-aware knowledge from both PV and BEV spaces to generate queries. 3) A 3D dual-view deformable attention mechanism, which aggregates discriminative features from both PV and BEV spaces into queries for accurate 3D lane detection. Extensive experiments on the public benchmark, OpenLane, demonstrate the efficacy and efficiency of DV-3DLane. It achieves state-of-the-art performance, with a remarkable 11.2 gain in F1 score and a substantial 53.5% reduction in errors. The code is available at https://github.com/JMoonr/dv-3dlane.

What problem does this paper attempt to address?

The paper primarily aims to improve the accuracy of 3D lane detection in autonomous driving, particularly in complex and variable environments (such as different weather and lighting conditions). To tackle this problem, the authors propose DV-3DLane, an end-to-end dual-view multimodal 3D lane detection framework. Specifically, the problems addressed in the paper can be summarized as:

1. **Overcoming the limitations of monocular techniques**: Traditional monocular camera-based techniques suffer from depth information loss and lighting variations, leading to inaccurate 3D lane detection.
2. **Leveraging the advantages of LiDAR**: Compared to monocular cameras, LiDAR can provide more accurate spatial positioning information, which helps improve the accuracy of 3D lane detection.
3. **Fusing image and LiDAR data**: By effectively fusing data from images and LiDAR, the method aims to fully utilize the advantages of both to enhance the effectiveness of 3D lane detection.

To achieve these goals, the paper proposes the following key technical points:

- **Bidirectional Feature Fusion (BFF) strategy**: This strategy fuses features between the image space (Perspective View, PV) and the bird's-eye view (BEV) to extract complementary information from both modalities.
- **Unified Query Generator (UQG)**: This generates two sets of queries containing lane-related information, one from each view, and merges them into a unified query set for the subsequent decoding process.
- **3D Dual-View Deformable Attention Mechanism**: This mechanism aggregates features from both the image space and the bird's-eye view space into the queries, thereby improving the accuracy of 3D lane detection.

With these components, DV-3DLane achieves significant performance improvements on the OpenLane dataset: according to the abstract, an 11.2 gain in F1 score and a 53.5% reduction in errors. Experimental results show that the method performs well even under stricter distance thresholds (e.g., 0.5 meters), demonstrating its potential in ensuring the safety of autonomous driving.
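
To make the dual-view idea above more concrete, the following is a minimal sketch of how a single 3D reference point could be located in both the PV and BEV feature maps and how the two sampled features could be combined. It is an illustration only, not the authors' implementation: the function names, the camera-frame convention (x right, y down, z forward), the pinhole intrinsics, the BEV ranges, and the plain averaging are all assumptions made for this example. The mechanism described in the paper uses learned sampling offsets and attention weights, which this sketch replaces with one nearest-neighbor sample per view and equal weights.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
FeatureMap = List[List[List[float]]]  # indexed as [row][col][channel]


def project_to_pv(p: Point, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-frame point to pixel (u, v).

    Assumed convention: x right, y down, z forward (hypothetical setup).
    """
    x, y, z = p
    if z <= 0:  # behind the camera, so not visible in the image
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def project_to_bev(p: Point, x_range: Tuple[float, float],
                   z_range: Tuple[float, float],
                   cell: float) -> Optional[Tuple[int, int]]:
    """Drop the height axis and map (x, z) to a BEV grid cell (row, col)."""
    x, _, z = p
    if not (x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]):
        return None
    return (int((z - z_range[0]) / cell), int((x - x_range[0]) / cell))


def sample_nearest(fmap: FeatureMap, row: float,
                   col: float) -> Optional[Sequence[float]]:
    """Nearest-neighbor lookup; returns None when out of bounds."""
    r, c = int(round(row)), int(round(col))
    if 0 <= r < len(fmap) and 0 <= c < len(fmap[0]):
        return fmap[r][c]
    return None


def dual_view_feature(p: Point, pv_map: FeatureMap, bev_map: FeatureMap,
                      intrinsics: Tuple[float, float, float, float],
                      pv_stride: float,
                      x_range: Tuple[float, float],
                      z_range: Tuple[float, float],
                      cell: float) -> Optional[List[float]]:
    """Average the features sampled at p's location in each available view."""
    feats = []
    uv = project_to_pv(p, *intrinsics)
    if uv is not None:
        # The PV feature map is assumed to be downsampled by pv_stride.
        f = sample_nearest(pv_map, uv[1] / pv_stride, uv[0] / pv_stride)
        if f is not None:
            feats.append(f)
    rc = project_to_bev(p, x_range, z_range, cell)
    if rc is not None:
        f = sample_nearest(bev_map, rc[0], rc[1])
        if f is not None:
            feats.append(f)
    if not feats:
        return None
    # Equal weights stand in for learned attention weights.
    return [sum(ch) / len(feats) for ch in zip(*feats)]


# Toy feature maps whose channels simply store their own (row, col) index.
pv_map = [[[float(r), float(c)] for c in range(20)] for r in range(13)]
bev_map = [[[float(r), float(c)] for c in range(40)] for r in range(200)]

point = (1.0, 1.5, 10.0)  # 1 m right, 1.5 m below the camera, 10 m ahead
fused = dual_view_feature(point, pv_map, bev_map,
                          intrinsics=(1000.0, 1000.0, 960.0, 640.0),
                          pv_stride=100.0,
                          x_range=(-10.0, 10.0), z_range=(0.0, 100.0),
                          cell=0.5)
print(fused)  # [14.0, 16.5]
```

In the toy example the point projects to pixel (1060, 790), i.e. PV feature cell (8, 11), and to BEV cell (20, 22), so the fused feature is the average of the two. The point of the sketch is that one 3D location gives a consistent lookup position in both views, which is what lets a query draw on image and LiDAR information at the same time.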

DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation

BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline

BEV-LaneDet: an Efficient 3D Lane Detection Based on Virtual Camera Via Key-Points

M$^2$-3DLaneNet: Exploring Multi-Modal 3D Lane Detection

Bi2Lane: Bi-Directional Temporal Refinement with Bi-Level Feature Aggregation for 3D Lane Detection

DILane: Dynamic Instance-Aware Network for Lane Detection

Advancements in 3D Lane Detection Using LiDAR Point Clouds: From Data Collection to Model Development

Anchor3DLane++: 3D Lane Detection Via Sample-Adaptive Sparse 3D Anchor Regression

Robust Monocular 3D Lane Detection with Dual Attention

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

ONCE-3DLanes: Building Monocular 3D Lane Detection

LATR: 3D Lane Detection from Monocular Images with Transformer

GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping

Data Driven 3D-Lane Detection Using Parallelism Loss Function

3D Lane Detection from Front or Surround-View using Joint-Modeling & Matching

A Multi-view 3D Vehicle Detection Method Based On Novel 3D Proposal Generation Method

Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection

3D Lane Detection With Attention in Attention

Deep Multi-Sensor Lane Detection

Learning to Detect 3D Lanes by Shape Matching and Embedding

An End-to-End Lane Detection Framework Based on Geometry Transform