Learning single-shot vehicle orientation estimation from large-scale street panoramas.

Xinzhe Zhou, Yigeng Fang, Yadong Mu
DOI: https://doi.org/10.1016/j.neucom.2019.07.060
IF: 6
2019-01-01
Neurocomputing
Abstract: Modern autonomous driving techniques harness a variety of visual cues for performing key tasks like vehicle steering and collision avoidance. This work defines, for the first time, and proposes a solution to the following problem: can one precisely estimate an autonomous vehicle’s orientation relative to the road by taking a picture from within the vehicle? Vehicle orientation supplies crucial information for making various driving-related decisions. The key challenge of this problem is the lack of training data, since accurately annotating a vehicle’s orientation in an image is in general difficult. To circumvent the data scarcity issue, we propose to leverage publicly available street panoramas such as those found in Google Maps. Street panoramas are essentially defined by 2-D plenoptic functions that map the visual environment to a sphere. One can thus effortlessly collect annotated data for the task of interest by specifying a direction and reading out the corresponding pixels from a panorama. In this way we collect a large, city-scale benchmark, TB500K, for studying this problem. Our second contribution is the exposition of a two-stream deep network for attacking this problem. Intuitively, visual/semantic and geometric (e.g., the direction of road lanes) cues are both crucial and complement each other. Our proposed two-stream network can jointly learn from both cues and serves as a reasonable baseline for this novel problem. In addition, several experiments are conducted to reveal key factors in estimating the vehicle orientation.
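The data-collection idea in the abstract (specify a direction, read out the corresponding pixels, and keep the direction as the label) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes an equirectangular panorama stored as a list of pixel rows, with column 0 at yaw 0 degrees and columns increasing with yaw, and it takes a plain column crop rather than a true perspective reprojection. The names `view_columns`, `crop_view`, `sample_pair`, and the parameters `fov_deg` and `max_offset_deg` are hypothetical.

```python
import random


def view_columns(width, yaw_deg, fov_deg=90.0):
    """Column indices of a view centered on yaw_deg (assumed layout: column 0 = yaw 0)."""
    cols_per_deg = width / 360.0
    center = round((yaw_deg % 360.0) * cols_per_deg)
    half = round(fov_deg * cols_per_deg / 2)
    # The modulo wraps the window around the panorama seam.
    return [(center + d) % width for d in range(-half, half)]


def crop_view(pano, yaw_deg, fov_deg=90.0):
    """Read out the pixels of one view; pano is a list of rows of pixels."""
    cols = view_columns(len(pano[0]), yaw_deg, fov_deg)
    return [[row[c] for c in cols] for row in pano]


def sample_pair(pano, road_heading_deg, rng, max_offset_deg=90.0):
    """Draw a random offset from the road heading; the offset is the free label."""
    offset = rng.uniform(-max_offset_deg, max_offset_deg)
    return crop_view(pano, road_heading_deg + offset), offset
```

Calling `sample_pair(pano, road_heading_deg, random.Random(0))` repeatedly would yield (image, angle) training pairs without manual annotation, provided the road heading at each panorama location is known, which is itself an assumption of this sketch.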