Abstract: In structured road scenarios, such as highways and urban roads, unexpected cut-in/cut-out maneuvers are among the leading causes of fatal accidents, and Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS) should be able to predict and avoid them in time. Existing cut-in prediction methods focus mainly on vehicles: they tend to apply convolution operations to the region of interest (ROI) covering a target vehicle in RGB images to obtain ROI feature vectors, and treat cut-in prediction as a time-sequence classification problem. However, the extracted ROI features are high-dimensional and, being local features, lack essential global information. To tackle these challenges, we propose a novel deep-learning-based framework to predict and classify potentially dangerous cut-in maneuvers of surrounding vehicles in egocentric video clips. Our algorithm has two components: 1) Environment Perception. We propose a two-branch architecture that predicts the local information of surrounding vehicles and the global information of lane key-points and fuses the two, extending the range of perception. 2) Maneuver Prediction. Based on the perceptual information from the first part, we design status descriptors and an adaptive cut-in ROI to classify early cut-in maneuvers on a principled basis. In addition, we contribute a Cut-in Maneuver of Surrounding Vehicles dataset (CMSV dataset), containing over 1,413,371 frames with classification labels. Experimental results show that our proposed framework achieves an accuracy of 0.9135 for cut-ins.
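
To make the idea of combining per-vehicle boxes with lane key-points more concrete, the following is a minimal sketch and not the framework described in the abstract. Everything in it is an assumption made for illustration: the function names (`interp_x`, `lateral_offset`, `classify_cut_in`), the choice of a normalized lateral offset as a "status descriptor", the `margin` parameter, and the simple rule used for classification. The ROI here adapts only in the sense that it is measured in ego-lane half-widths at the image row of the vehicle, so it scales with perspective.

```python
from bisect import bisect_left


def interp_x(keypoints, y):
    """Linearly interpolate a lane boundary's x coordinate at image row y.

    keypoints: list of (x, y) points sorted by increasing y (hypothetical format).
    Rows outside the key-point range are clamped to the nearest key-point.
    """
    ys = [p[1] for p in keypoints]
    if y <= ys[0]:
        return keypoints[0][0]
    if y >= ys[-1]:
        return keypoints[-1][0]
    i = bisect_left(ys, y)
    (x0, y0), (x1, y1) = keypoints[i - 1], keypoints[i]
    t = (y - y0) / (y1 - y0)
    return x0 + t * (x1 - x0)


def lateral_offset(box, left_lane, right_lane):
    """Hypothetical status descriptor for one frame.

    Returns the signed offset of the box's bottom-center from the ego-lane
    center, in units of lane half-widths at that image row. An absolute
    value below 1 means the bottom-center lies inside the ego lane.
    box: (x_min, y_min, x_max, y_max) in pixels.
    """
    x_min, _, x_max, y_max = box
    cx = (x_min + x_max) / 2
    xl = interp_x(left_lane, y_max)
    xr = interp_x(right_lane, y_max)
    center = (xl + xr) / 2
    half_width = (xr - xl) / 2
    return (cx - center) / half_width


def classify_cut_in(boxes, left_lane, right_lane, margin=1.3):
    """Assumed rule, for illustration only.

    A track is flagged as a cut-in if it starts outside the widened ROI
    (|offset| > margin), moves strictly toward the lane center in every
    frame, and ends inside the widened ROI.
    """
    offsets = [lateral_offset(b, left_lane, right_lane) for b in boxes]
    started_outside = abs(offsets[0]) > margin
    approaching = all(abs(b) < abs(a) for a, b in zip(offsets, offsets[1:]))
    entered_roi = abs(offsets[-1]) <= margin
    return started_outside and approaching and entered_roi


# Ego-lane boundaries as key-points, converging toward the top of the image.
left_lane = [(300, 200), (100, 400)]
right_lane = [(340, 200), (540, 400)]

# A vehicle drifting from the right toward the ego lane over three frames.
cutting_in = [(650, 300, 750, 400), (550, 300, 650, 400), (470, 300, 570, 400)]
# A vehicle holding its position in the adjacent lane.
staying = [(650, 300, 750, 400)] * 3

print(classify_cut_in(cutting_in, left_lane, right_lane))
print(classify_cut_in(staying, left_lane, right_lane))
```

A learned classifier over such descriptors would replace the hand-written rule in `classify_cut_in`; the rule is kept here only so that the sketch stays self-contained.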

Probabilistic Future Prediction for Video Scene Understanding

Multiple Futures Prediction

FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras

Predicting Long-horizon Futures by Conditioning on Geometry and Time

An End-to-End Future Frame Prediction Method for Vehicle-Centric Driving Videos

One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

Predicting Deeper into the Future of Semantic Segmentation

PrognoseNet: A Generative Probabilistic Framework for Multimodal Position Prediction given Context Information

Self-supervised Multi-future Occupancy Forecasting for Autonomous Driving

SceneMotion: From Agent-Centric Embeddings to Scene-Wide Forecasts

Cut-in Prediction in Egocentric Videos Using Extended Environment Perception with Status Descriptors

Bayesian Prediction of Future Street Scenes through Importance Sampling based Optimization

Unsupervised video forecasting with flow parsing mechanism of human visual system

Video Prediction via Example Guidance

Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation

What-If Motion Prediction for Autonomous Driving

Vision-Guided Forecasting -- Visual Context for Multi-Horizon Time Series Forecasting

Jointly Predicting Future Sequence And Steering Angles For Dynamic Driving Scenes

Vehicle Motion Forecasting using Prior Information and Semantic-assisted Occupancy Grid Maps

Flow-guided Motion Prediction with Semantics and Dynamic Occupancy Grid Maps