Building Facade Parsing R-CNN

Sijie Wang,Qiyu Kang,Rui She,Wee Peng Tay,Diego Navarro Navarro,Andreas Hartmannsgruber
DOI: https://doi.org/10.48550/arXiv.2205.05912
2022-05-12
Abstract:Building facade parsing, which predicts pixel-level labels for building facades, has applications in computer vision perception for autonomous vehicle (AV) driving. However, instead of a frontal view, an on-board camera of an AV captures a deformed view of the facade of the buildings on both sides of the road the AV is travelling on, due to the camera perspective. We propose Facade R-CNN, which includes a transconv module, generalized bounding box detection, and convex regularization, to perform parsing of deformed facade views. Experiments demonstrate that Facade R-CNN achieves better performance than the current state-of-the-art facade parsing models, which are primarily developed for frontal views. We also publish a new building facade parsing dataset derived from the Oxford RobotCar dataset, which we call the Oxford RobotCar Facade dataset. This dataset contains 500 street-view images from the Oxford RobotCar dataset augmented with accurate annotations of building facade objects. The published dataset is available at <a class="link-external link-https" href="https://github.com/sijieaaa/Oxford-RobotCar-Facade" rel="external noopener nofollow">this https URL</a>
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?