DeFlow: Decoder of Scene Flow Network in Autonomous Driving

Qingwen Zhang, Yi Yang, Heng Fang, Ruoyu Geng, Patric Jensfelt
2024-01-29
Abstract: Scene flow estimation determines a scene's 3D motion field by predicting the motion of points in the scene, especially to aid tasks in autonomous driving. Many networks that take large-scale point clouds as input use voxelization to create a pseudo-image so they can run in real time. However, the voxelization process often results in the loss of point-specific features. This gives rise to a challenge in recovering those features for scene flow tasks. Our paper introduces DeFlow, which enables a transition from voxel-based features to point features using Gated Recurrent Unit (GRU) refinement. To further enhance scene flow estimation performance, we formulate a novel loss function that accounts for the data imbalance between static and dynamic points. Evaluations on the Argoverse 2 scene flow task reveal that DeFlow achieves state-of-the-art results on large-scale point cloud data, demonstrating that our network has better performance and efficiency compared to others. The code is open-sourced at
Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
The paper targets real-time scene flow estimation for autonomous driving. Scene flow estimation determines the 3D motion field of a scene by predicting the motion of each point, which in turn supports autonomous driving tasks. Existing methods face two challenges on large-scale point cloud data:

1. **Processing large-scale point cloud data**: Many existing methods run out of memory on large-scale point clouds. In modern driving datasets, a single frame usually contains between 80,000 and 177,000 points, which makes traditional optimization methods difficult to use in real-time applications.
2. **Feature loss caused by voxelization**: To run in real time, many methods voxelize the point cloud into a pseudo-image. This process often discards point-specific features and hurts performance on the scene flow task. Existing decoder designs for voxelized point clouds cannot effectively distinguish the features of different points that fall within the same voxel.

To address these problems, the paper proposes DeFlow. Its main contributions are:

- **A new network architecture**: DeFlow uses gated recurrent unit (GRU) refinement modules to move from voxel features to point features, significantly improving the accuracy of the final results.
- **A new loss function**: The loss is designed to account for the data imbalance between static and dynamic points, further improving scene flow estimation.
- **Strong performance on large-scale point cloud datasets**: DeFlow achieves state-of-the-art results on the Argoverse 2 online leaderboard, demonstrating its efficiency and accuracy on large-scale point cloud data.

With these contributions, DeFlow achieves real-time scene flow estimation while maintaining high accuracy, which makes it suitable for practical applications such as autonomous driving.
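
To make the static/dynamic imbalance concrete, here is a minimal sketch of one way such a loss could be built: points are grouped by ground-truth speed and the endpoint error is averaged within each group before summing, so a large static majority cannot drown out a few moving points. The function names, the 0.1 s frame interval, and the speed thresholds are illustrative assumptions, not details taken from the summary above, and this is not presented as the paper's exact formulation.

```python
import math


def mean_endpoint_error(pred, gt):
    """Plain average of per-point endpoint errors (dominated by the majority class)."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def balanced_flow_loss(pred, gt, dt=0.1, thresholds=(0.4, 1.0)):
    """Sum of per-group mean endpoint errors, with groups split by ground-truth speed.

    dt (seconds between frames) and thresholds (m/s) are assumed values
    chosen for illustration only.
    """
    groups = [[] for _ in range(len(thresholds) + 1)]
    for p, g in zip(pred, gt):
        err = math.dist(p, g)                       # endpoint error for this point
        speed = math.hypot(*g) / dt                 # ground-truth speed in m/s
        idx = sum(speed >= t for t in thresholds)   # 0 = slowest group
        groups[idx].append(err)
    # Average inside each non-empty group, then sum across groups.
    return sum(sum(errs) / len(errs) for errs in groups if errs)


# Toy frame: 98 static points and 2 points moving 0.2 m per frame (2 m/s).
gt = [(0.0, 0.0, 0.0)] * 98 + [(0.2, 0.0, 0.0)] * 2
pred = [(0.0, 0.0, 0.0)] * 100                      # a model that predicts "nothing moves"

plain = mean_endpoint_error(pred, gt)
balanced = balanced_flow_loss(pred, gt)
```

On this toy frame the plain average is about 0.004, so the all-static prediction looks almost perfect, while the grouped loss is about 0.2 because the two moving points form their own group and contribute their full mean error.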