Homography Guided Temporal Fusion for Road Line and Marking Segmentation

Shan Wang, Chuong Nguyen, Jiawei Liu, Kaihao Zhang, Wenhan Luo, Yanhao Zhang, Sundaram Muthu, Fahira Afzal Maken, Hongdong Li
2024-04-11
Abstract: Reliable segmentation of road lines and markings is critical to autonomous driving. Our work is motivated by the observations that road lines and markings are (1) frequently occluded in the presence of moving vehicles, shadow, and glare and (2) highly structured with low intra-class shape variance and overall high appearance consistency. To solve these issues, we propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues facilitating the correct classification of the partially occluded road lines or markings. To reduce computational complexity, a novel surface normal estimator is proposed to establish spatial correspondences between the sampled frames, allowing the HomoFusion module to perform a pixel-to-pixel attention mechanism in updating the representation of the occluded road lines or markings. Experiments on ApolloScape, a large-scale lane mark segmentation dataset, and ApolloScape Night with artificially simulated night-time road conditions, demonstrate that our method outperforms other existing SOTA lane mark segmentation models with less than 9% of their parameters and computational complexity. We show that exploiting available camera intrinsic data and ground plane assumption for cross-frame correspondence can lead to a light-weight network with significantly improved performances in speed and accuracy. We also prove the versatility of our HomoFusion approach by applying it to the problem of water puddle segmentation and achieving SOTA performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses lane line and road marking segmentation for autonomous driving. Its focus is improving segmentation accuracy when lines and markings are partially occluded, which is common in real driving environments. The key contributions are:

1. **Homography Guided Fusion (HomoFusion) Module**: This module uses complementary information from adjacent video frames to recover occluded lane lines or markings. It estimates a homography transformation matrix to establish pixel-level spatial correspondences between frames, then uses those correspondences to update the representation of occluded areas, which improves classification accuracy.
2. **Road Surface Normal Estimator (RSNE)**: This is a novel method for estimating the road surface normal. Combined with the intrinsic and extrinsic parameters of the vehicle-mounted camera, the estimated normal yields an accurate homography transformation matrix, and with it the spatial correspondences between frames. RSNE simplifies homography estimation from an 8-degree-of-freedom (DoF) problem to estimating a normal vector with only 2 DoF.
3. **Lightweight Lane Marking Segmentation Model**: Compared with existing methods, the model significantly reduces complexity and computational requirements without sacrificing performance. It outperforms current state-of-the-art (SOTA) methods while using less than 9% of their parameters and computational load.

In experiments, the method achieves superior performance on the ApolloScape dataset and on the ApolloScape Night dataset, which simulates night-time conditions, particularly when handling occlusion, shadows, and reflections.
Additionally, the authors discuss in detail the selection of key hyperparameters in the model and further validate the effectiveness of the proposed method through ablation studies.
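To make the role of the road surface normal concrete, the following is a minimal sketch of the textbook plane-induced homography, not the paper's RSNE or HomoFusion implementation. It assumes a pinhole camera with intrinsics `K`, a relative pose `X2 = R @ X1 + t` between two frames, and a road plane written as `n . X = d` in the first camera's coordinates (x right, y down, z forward). The function names and the pitch/roll parameterisation of the 2-DoF normal are illustrative assumptions.

```python
import numpy as np


def normal_from_angles(pitch, roll):
    """Unit road normal in camera coordinates from two angles (radians).

    Assumed convention: x right, y down, z forward, so a level camera
    above a flat road sees the normal (0, 1, 0).
    """
    return np.array([
        -np.sin(roll),
        np.cos(pitch) * np.cos(roll),
        np.sin(pitch) * np.cos(roll),
    ])


def plane_homography(K, R, t, n, d):
    """Homography mapping frame-1 pixels to frame-2 pixels.

    Valid only for 3D points on the plane n . X = d (frame-1 coordinates),
    with the relative pose defined by X2 = R @ X1 + t.
    """
    return K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)


def warp_pixel(H, uv):
    """Apply a homography to a single pixel (u, v)."""
    p = H @ np.array([uv[0], uv[1], 1.0])
    return p[:2] / p[2]


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    K = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    n = normal_from_angles(pitch=0.02, roll=-0.01)
    d = 1.5                          # camera height above the road, metres
    R = np.eye(3)                    # no rotation between the two frames
    t = np.array([0.0, 0.0, -1.0])   # camera moved 1 m forward
    H = plane_homography(K, R, t, n, d)
    print(warp_pixel(H, (700.0, 500.0)))
```

With `K`, `R`, `t`, and the camera height `d` taken as known, the two angles of the normal are the only unknowns left in `H`, which is one way to read the reduction from 8 DoF to 2 DoF described above.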