Detection and Retrieval of Out-of-Distribution Objects in Semantic Segmentation

Philipp Oberdiek, Matthias Rottmann, Gernot A. Fink
DOI: https://doi.org/10.48550/arXiv.2005.06831
2020-05-14
Abstract: When deploying deep learning technology in self-driving cars, deep neural networks are constantly exposed to domain shifts. These include, e.g., changes in weather conditions, time of day, and long-term temporal shift. In this work we utilize a deep neural network trained on the Cityscapes dataset containing urban street scenes and infer images from a different dataset, the A2D2 dataset, which also contains countryside and highway images. We present a novel pipeline for semantic segmentation that detects out-of-distribution (OOD) segments by means of the deep neural network's prediction and performs image retrieval after feature extraction and dimensionality reduction on image patches. In our experiments we demonstrate that the deployed OOD approach is suitable for detecting out-of-distribution concepts. Furthermore, we evaluate the image patch retrieval qualitatively as well as quantitatively by means of the semi-compatible A2D2 ground truth and obtain mAP values of up to 52.2%.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the detection and retrieval of out-of-distribution (OOD) objects in semantic segmentation. When deep learning is deployed in self-driving cars, deep neural networks are constantly exposed to domain shifts, including changes in weather conditions, time of day, and long-term temporal shift. The authors take a deep neural network trained on the Cityscapes dataset (urban street scenes), run inference on images from a different dataset, A2D2, which also contains countryside and highway images, and propose a new semantic segmentation pipeline. The pipeline detects out-of-distribution segments from the network's own predictions and then performs image retrieval on image patches after feature extraction and dimensionality reduction.
The main contributions of the paper are:
1. Demonstrating that MetaSeg can reliably predict the intersection over union (IoU) of domain samples.
2. Showing with MetaSeg that unknown object classes can be detected.
3. Extracting visual features so that the discovered entities can be grouped in a semantically meaningful embedding space.
4. Evaluating the image retrieval task with several common deep learning architectures as feature extractors.
To achieve these goals, the paper details how to use MetaSeg for OOD detection and how to explore unknown objects in newly collected data through content-based image retrieval, thereby revealing the weaknesses of the model under specific domain shifts. This helps to improve existing models and can also guide future data collection and model training.
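The detect-then-retrieve flow described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-segment predicted IoU is already available (for example from a MetaSeg-style meta-regressor), and the 0.5 threshold, the SVD-based PCA, the cosine-similarity ranking, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np


def flag_ood_segments(predicted_iou, threshold=0.5):
    """Return indices of segments whose predicted IoU is below the threshold.

    Assumption: a low predicted IoU marks a segment as an OOD candidate.
    The threshold value is illustrative, not taken from the paper.
    """
    predicted_iou = np.asarray(predicted_iou, dtype=float)
    return np.flatnonzero(predicted_iou < threshold)


def pca_reduce(features, n_components):
    """Project feature vectors onto their top principal components (via SVD)."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def retrieve(embeddings, query_index):
    """Rank all other patches by cosine similarity to the query patch."""
    embeddings = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    similarities = unit @ unit[query_index]
    order = np.argsort(-similarities)
    return order[order != query_index]


def average_precision(ranked_labels, query_label):
    """Average precision of one ranked result list for one query label."""
    relevant = np.asarray(ranked_labels) == query_label
    if not relevant.any():
        return 0.0
    hits = np.cumsum(relevant)
    ranks = np.arange(1, len(relevant) + 1)
    # Mean of precision@k taken at every rank k that holds a relevant item.
    return float((hits[relevant] / ranks[relevant]).mean())
```

In this sketch, the patches of flagged segments would be passed through a feature extractor, reduced with `pca_reduce`, and ranked with `retrieve`; averaging `average_precision` over all queries gives a mean average precision (mAP) comparable in spirit to the metric the paper reports.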