Reinforced Wasserstein Training for Severity-Aware Semantic Segmentation in Autonomous Driving

Xiaofeng Liu, Yimeng Zhang, Xiongchang Liu, Song Bai, Site Li, Jane You
DOI: https://doi.org/10.48550/arXiv.2008.04751
2020-08-11
Abstract: Semantic segmentation, which predicts the class of each pixel, is important for many real-world systems, e.g., autonomous vehicles. Recently, deep networks trained with the cross-entropy loss have achieved significant progress w.r.t. the mean Intersection-over-Union (mIoU). However, the cross-entropy loss essentially ignores the fact that different misclassifications differ in severity for an autonomous car. For example, misclassifying a car as road is much more severe than recognizing it as a bus. To address this difficulty, we develop a Wasserstein training framework that explores the inter-class correlation by defining its ground metric as misclassification severity. The ground metric of the Wasserstein distance can be pre-defined following experience on a specific task. From the optimization perspective, we further propose to set the ground metric as an increasing function of the pre-defined ground metric. Furthermore, an adaptive learning scheme for the ground matrix is proposed to utilize the high-fidelity CARLA simulator. Specifically, we follow an alternating learning scheme based on reinforcement learning. Experiments on both the CamVid and Cityscapes datasets demonstrate the effectiveness of our Wasserstein loss. The SegNet, ENet, FCN and Deeplab networks can be adapted in a plug-in manner. We achieve significant improvements on the predefined important classes, and much longer continuous playtime in our simulator.
Computer Vision and Pattern Recognition, Machine Learning, Performance, Robotics
What problem does this paper attempt to address?
This paper addresses the fact that current semantic segmentation models for autonomous driving do not account for how much different mispredictions differ in severity. Existing models are usually trained with the cross-entropy loss, which penalizes every wrong class in the same way regardless of the consequences that the error may have for the driving system. For example, misclassifying a car as road is much more serious than misclassifying it as a bus: the former may lead to a serious traffic accident, while the latter still causes problems but is comparatively less dangerous. To address this, the authors propose a training framework based on the Wasserstein distance, in which the ground metric encodes the severity of each misclassification and thereby captures inter-class correlations. The ground metric can be pre-defined from experience on a specific task, or learned adaptively using the high-fidelity CARLA simulator. For the adaptive case, the authors propose an alternating optimization scheme based on reinforcement learning that updates the ground metric matrix and the driving policy in turn. In summary, the paper aims to improve the reliability and safety of semantic segmentation in autonomous driving systems by introducing a loss function that takes the severity of misclassifications into account.
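To make the idea of a severity-aware loss concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It relies on one standard fact: when the target is a one-hot label with true class j*, the only feasible transport plan moves all predicted mass to j*, so the Wasserstein distance reduces to the sum over classes i of p_i · D[i, j*], where p is the softmax output and D is the ground metric. The function name `severity_aware_loss`, the three-class setup (car, bus, road) and the numbers in `D` are hypothetical, chosen only for illustration.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def severity_aware_loss(logits, labels, ground_metric):
    """Wasserstein-style loss for one-hot targets (illustrative sketch).

    logits:        (N, C) array of per-pixel class scores
    labels:        (N,) integer array of true class indices
    ground_metric: (C, C) array, D[i, j] = severity of predicting i when truth is j
    """
    probs = softmax(logits)                 # (N, C) predicted class distribution
    # costs[n, i] = D[i, labels[n]]: cost of mass placed on class i for pixel n
    costs = ground_metric[:, labels].T      # (N, C)
    return float((probs * costs).sum(axis=1).mean())


# Hypothetical classes: 0 = car, 1 = bus, 2 = road.
# Assumed severities: confusing car and bus is mild, confusing either with road is severe.
D = np.array([
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 5.0],
    [5.0, 5.0, 0.0],
])

labels = np.array([0])  # the pixel is truly a car

# Two predictions that give the true class the same probability (0.2),
# so cross-entropy cannot tell them apart.
leans_bus = np.log(np.array([[0.2, 0.7, 0.1]]))
leans_road = np.log(np.array([[0.2, 0.1, 0.7]]))

loss_bus = severity_aware_loss(leans_bus, labels, D)
loss_road = severity_aware_loss(leans_road, labels, D)
print(loss_bus, loss_road)
```

Under these assumed severities the prediction leaning toward road is penalized about three times as heavily as the one leaning toward bus (roughly 3.6 versus 1.2), even though both have identical cross-entropy. Passing an increasing function of `D` (for example an elementwise square, picked here only as an illustration) corresponds to the abstract's suggestion of using an increasing function of the pre-defined ground metric.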