RadarMOSEVE: A Spatial-Temporal Transformer Network for Radar-Only Moving Object Segmentation and Ego-Velocity Estimation

Changsong Pang, Xieyuanli Chen, Yimin Liu, Huimin Lu, Yuwei Cheng
DOI: https://doi.org/10.1609/aaai.v38i5.28240
2024-02-22
Abstract: Moving object segmentation (MOS) and ego-velocity estimation (EVE) are vital capabilities for mobile systems to achieve full autonomy. Several approaches have attempted to achieve MOSEVE using a LiDAR sensor. However, LiDAR sensors are typically expensive and susceptible to adverse weather conditions. Instead, millimeter-wave radar (MWR) has gained popularity in robotics and autonomous driving for real applications due to its cost-effectiveness and resilience to bad weather. Nonetheless, publicly available MOSEVE datasets and approaches using radar data are limited. Some existing methods adopt point convolutional networks from LiDAR-based approaches, ignoring the specific artifacts and the valuable radial velocity information of radar measurements, leading to suboptimal performance. In this paper, we propose a novel transformer network that effectively addresses the sparsity and noise issues and leverages the radial velocity measurements of radar points using our devised radar self- and cross-attention mechanisms. Based on that, our method achieves accurate EVE of the robot and performs MOS using only radar data simultaneously. To thoroughly evaluate the MOSEVE performance of our method, we annotated the radar points in the public View-of-Delft (VoD) dataset and additionally constructed a new radar dataset in various environments. The experimental results demonstrate the superiority of our approach over existing state-of-the-art methods. The code is available at https://github.com/ORCA-Uboat/RadarMOSEVE.
Robotics
What problem does this paper attempt to address?
The paper addresses two capabilities that mobile systems need for full autonomy: moving object segmentation (MOS) and ego-velocity estimation (EVE). Specifically, it asks how to perform both tasks simultaneously using only millimeter-wave radar (MWR) data. Existing approaches usually rely on LiDAR sensors, which are expensive and susceptible to adverse weather. Millimeter-wave radar is cost-effective and resilient to bad weather, but publicly available radar datasets and methods for these tasks are limited. Some existing methods directly adopt point convolutional networks from LiDAR-based approaches, ignoring the specific artifacts of radar measurements and their valuable radial velocity information, which leads to suboptimal performance. The paper therefore proposes a transformer network designed around the characteristics of radar data.

### Main Contributions

1. **A new radar transformer module**: The module contains purpose-built self-attention and cross-attention mechanisms that handle the sparsity and noise of radar data and exploit the radial velocity of radar points.
2. **A new radar MOS and EVE framework**: The framework uses the radar's Doppler velocity information to perform MOS and EVE simultaneously.
3. **Superior performance on multiple datasets**: Experiments on the View-of-Delft (VoD) dataset and a newly constructed dataset show that the proposed method outperforms existing state-of-the-art methods on both MOS and EVE.
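To illustrate why radial velocity carries information for both tasks, the following minimal sketch uses the classical geometric relation between Doppler measurements and ego motion. It is not the paper's transformer network. It assumes that the sensor sits at the origin, that radial velocity is positive when a point moves away from the sensor, and that a static point with unit direction `d` therefore measures `v_r = -d · v_ego`. The function names and the 0.5 m/s threshold are hypothetical choices made for this example.

```python
import math


def unit(p):
    """Return the unit vector pointing from the sensor origin to point p."""
    norm = math.sqrt(sum(c * c for c in p))
    return [c / norm for c in p]


def solve_3x3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("point directions do not constrain all three axes")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def estimate_ego_velocity(points, radial_velocities):
    """Least-squares ego velocity, assuming every input point is static.

    Minimises sum_i (v_r_i + d_i . v)^2 via the normal equations
    (sum_i d_i d_i^T) v = -sum_i d_i v_r_i.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for p, vr in zip(points, radial_velocities):
        d = unit(p)
        for i in range(3):
            atb[i] -= d[i] * vr
            for j in range(3):
                ata[i][j] += d[i] * d[j]
    return solve_3x3(ata, atb)


def compensate(points, radial_velocities, v_ego):
    """Remove the ego-motion component from each radial velocity.

    Under the assumed model, static points end up near zero.
    """
    return [
        vr + sum(di * vi for di, vi in zip(unit(p), v_ego))
        for p, vr in zip(points, radial_velocities)
    ]


def segment_moving(points, radial_velocities, v_ego, threshold=0.5):
    """Label a point as moving if its compensated radial speed exceeds threshold (m/s)."""
    residuals = compensate(points, radial_velocities, v_ego)
    return [abs(r) > threshold for r in residuals]
```

In this sketch, moving points passed to `estimate_ego_velocity` would bias the fit, so such least-squares estimates are commonly wrapped in a robust scheme such as RANSAC. Sparse, noisy radar returns make this geometric baseline fragile, which is the kind of limitation a learned approach aims to address.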
### Method Overview

- **Radar self-attention mechanism**: Object-attention and scene-attention mechanisms capture the spatial information in the radar point cloud and extract more informative features.
- **Radar cross-attention mechanism**: Uses the radar point clouds of consecutive frames to capture spatio-temporal dependencies and improve the accuracy of EVE and MOS.
- **EVE module**: Estimates the robot's ego velocity from two frames of 4D radar point clouds, and adopts a new Doppler loss function to improve estimation accuracy.
- **MOS module**: The output of the EVE module is used to compensate the radial velocities of the radar points, and the compensated point cloud is passed to the MOS module for moving object segmentation.

### Experimental Results

- **MOS task**: On the self-constructed dataset and the VoD dataset, RadarMOSEVE significantly outperforms traditional and deep-learning-based methods on Intersection over Union (IoU), F1-score, and accuracy.
- **EVE task**: On the self-constructed dataset, RadarMOSEVE performs well on Mean Absolute Error (MAE) and Mean Squared Error (MSE), and its accuracy under different ego-velocity thresholds is also much higher than that of the other methods.

### Conclusion

The paper proposes a new radar-only method for MOS and EVE. Its radar transformer module and framework address the sparsity and noise of radar data while exploiting the radar's radial velocity information to perform both tasks. The experimental results support the superiority and robustness of the method.
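For reference, the metrics named in the experimental results have standard definitions, sketched below. This is a minimal illustration, not the paper's evaluation code. It assumes that MOS is scored on the "moving" class from per-point boolean labels and that EVE errors are computed on scalar values. The function names `mos_metrics` and `eve_errors` are hypothetical.

```python
def mos_metrics(pred, truth):
    """IoU, F1-score, and accuracy for the 'moving' class.

    pred and truth are equal-length sequences of per-point booleans
    (True = moving).
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    total = tp + fp + fn + tn
    # Guard against empty denominators (no moving points predicted or labeled).
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"iou": iou, "f1": f1, "accuracy": accuracy}


def eve_errors(pred, truth):
    """Mean absolute error and mean squared error over paired scalar estimates."""
    errors = [p - t for p, t in zip(pred, truth)]
    if not errors:
        raise ValueError("need at least one prediction")
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse
```

IoU and F1 ignore true negatives, so they are not inflated by the many static points in a typical scene, whereas accuracy counts them.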