Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset

Scott Ettinger, Shuyang Cheng, Benjamin Caine, Chenxi Liu, Hang Zhao, Sabeek Pradhan, Yuning Chai, Ben Sapp, Charles Qi, Yin Zhou, Zoey Yang, Aurelien Chouard, Pei Sun, Jiquan Ngiam, Vijay Vasudevan, Alexander McCauley, Jonathon Shlens, Dragomir Anguelov
2021-04-21
Abstract: As autonomous driving systems mature, motion forecasting has received increasing attention as a critical requirement for planning. Of particular importance are interactive situations such as merges, unprotected turns, etc., where predicting individual object motion is not sufficient. Joint predictions of multiple objects are required for effective route planning. There has been a critical need for high-quality motion data that is rich in both interactions and annotation to develop motion planning models. In this work, we introduce the most diverse interactive motion dataset to our knowledge, and provide specific labels for interacting objects suitable for developing joint prediction models. With over 100,000 scenes, each 20 seconds long at 10 Hz, our new dataset contains more than 570 hours of unique data over 1750 km of roadways. It was collected by mining for interesting interactions between vehicles, pedestrians, and cyclists across six cities within the United States. We use a high-accuracy 3D auto-labeling system to generate high quality 3D bounding boxes for each road agent, and provide corresponding high definition 3D maps for each scene. Furthermore, we introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models. Finally, we provide strong baseline models for individual-agent prediction and joint-prediction. We hope that this new large-scale interactive motion dataset will provide new opportunities for advancing motion forecasting models.
Computer Vision and Pattern Recognition, Machine Learning, Robotics
What problem does this paper attempt to address?
This paper addresses how an autonomous driving system can effectively predict the motion of multiple interacting objects, especially in complex interactive scenarios such as merges and unprotected turns. Single-object motion prediction models are not sufficient in these situations, so models capable of joint prediction are needed. Specifically, the paper focuses on the following problems:

1. **Lack of high-quality interaction data**: Most existing datasets focus on the motion of individual objects and pay less attention to large-scale interaction modeling. Developing effective joint prediction models requires a high-quality dataset that is rich in both interactions and annotations.
2. **Challenges of multi-object joint prediction**: In driving environments that involve interactions between vehicles, pedestrians, and cyclists, predicting the motion of a single object is not enough. The trajectories of multiple interacting objects must be predicted jointly to support effective route planning.
3. **Insufficient evaluation metrics**: Existing evaluation metrics mainly target single-object prediction and do not adequately evaluate multi-object joint prediction. New metrics are therefore needed to measure the performance of joint prediction models comprehensively.

To address these problems, the authors introduce a large-scale interactive motion dataset, the **Waymo Open Motion Dataset**, and make the following contributions:

- **Large-scale interaction dataset**: The dataset contains more than 100,000 scenes, each 20 seconds long and sampled at 10 Hz, for a total of more than 570 hours of data covering 1,750 km of roadways. It was collected from driving data in six cities in the United States and contains rich interaction scenarios, such as vehicle-pedestrian and vehicle-vehicle interactions.
- **High-quality annotations**: A high-accuracy 3D auto-labeling system generates high-quality 3D bounding boxes for each road agent, and corresponding high-definition 3D maps are provided for each scene.
- **New evaluation metrics**: A new set of metrics, including minimum average displacement error (minADE), minimum final displacement error (minFDE), overlap rate (OR), miss rate (MR), and mean average precision (mAP), evaluates both single-object and multi-object joint prediction models.
- **Baseline models**: Strong baseline models are provided for both individual-agent prediction and joint prediction, giving researchers a reference point for improving motion forecasting models.

Through these contributions, the authors hope that this new large-scale interactive motion dataset will open new opportunities for advancing motion forecasting models.
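
To make the displacement-based metrics concrete, the following is a minimal sketch of minADE, minFDE, and a simplified miss check, plus a joint variant of minADE that scores all agents under the same predicted future. It is not the benchmark's official implementation: the function names, the 2D `(x, y)` trajectory format, and the single fixed 2.0 m miss threshold are assumptions made for illustration, and the paper's own definitions may differ in details such as how the thresholds are chosen.

```python
import math


def displacement_errors(pred, gt):
    """Per-timestep Euclidean distance between two (x, y) trajectories."""
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def ade(pred, gt):
    """Average displacement error of one predicted trajectory."""
    errs = displacement_errors(pred, gt)
    return sum(errs) / len(errs)


def min_ade(preds, gt):
    """Smallest ADE over K candidate trajectories for a single agent."""
    return min(ade(p, gt) for p in preds)


def min_fde(preds, gt):
    """Smallest final-timestep error over K candidate trajectories."""
    return min(displacement_errors(p, gt)[-1] for p in preds)


def is_miss(preds, gt, threshold_m=2.0):
    """Simplified miss check: no candidate ends within threshold_m of the truth.

    The fixed threshold is an assumption for this sketch.
    """
    return min_fde(preds, gt) > threshold_m


def joint_min_ade(joint_preds, gts):
    """Smallest agent-averaged ADE over K joint futures.

    joint_preds[k][a] is agent a's trajectory in joint future k, so every
    agent is scored under the same k rather than picking its own best one.
    """
    return min(
        sum(ade(p, g) for p, g in zip(joint, gts)) / len(gts)
        for joint in joint_preds
    )


def shifted(traj, dy):
    """Hypothetical helper: offset a trajectory by dy metres along y."""
    return [(x, y + dy) for x, y in traj]


# Toy ground truth for two agents over three timesteps.
gt_a = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
gt_b = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]

# Two joint futures: the first is perfect for agent A but 4 m off for agent B,
# the second is 1 m off for both agents.
joint_preds = [
    [shifted(gt_a, 0.0), shifted(gt_b, 4.0)],
    [shifted(gt_a, 1.0), shifted(gt_b, 1.0)],
]

marginal_a = min_ade([j[0] for j in joint_preds], gt_a)
marginal_b = min_ade([j[1] for j in joint_preds], gt_b)
joint = joint_min_ade(joint_preds, [gt_a, gt_b])
```

In this toy case the per-agent (marginal) scores choose a different candidate for each agent, while the joint score must commit to one future for both. This is the distinction the summary draws between single-object and joint evaluation.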