Obtain Datasets for Self-Driving Perception from Video Games Automatically

Yifeng Huang, Dong Dong, Chuan Lv
DOI: https://doi.org/10.1109/ICRMS.2018.00046
2018-01-01
Abstract: Perception in self-driving cars is a challenging computer vision task that requires detecting the locations of vehicles, pedestrians, and cyclists in images. Recent progress in object detection and localization relies heavily on large datasets. Unfortunately, creating such datasets is very costly in both human labor and professional equipment. In this paper we present an approach for creating pixel-level segmentation label maps and depth maps for images extracted from modern AAA computer games such as GTA5. The fidelity of these games is high enough that the images can even be used directly for real-world object detection and localization. Without any human effort or a depth camera, we can automatically generate datasets from a photorealistic computer game with pixel-level labels and depth maps of cars and pedestrians. We have generated thousands of data samples for testing, and the number is still growing. Our datasets can be applied to real-world situations through transfer learning. Beyond image segmentation and depth data, we can also extract other useful data from games, such as image-to-coordinate data and image-to-driving-pose data, whose labels are extracted directly from the game's metadata.