Image Enhancement and Translation for RGB-D Indoor Scene Recognition

Sifan Yang, Yue Wang, Yang Li, Guijin Wang
DOI: https://doi.org/10.1145/3408127.3408174
2020-01-01
Abstract: Most existing methods for RGB-D indoor scene recognition adopt backbone networks designed for image recognition. They focus heavily on global features and largely ignore local features, which results in unsatisfactory accuracy in practice. This paper proposes a T-like network, called T-Net, that exploits both global and local features through multi-scale supervision. Specifically, we add an image translation branch and introduce pixel-level semantic segmentation annotations alongside the image-level labels, so that the two jointly supervise the model and lead it to discover more object regions. In addition, low-quality source images that have not been enhanced make it difficult to extract representative features. To address this issue, Multi-Scale Retinex with Color Restoration (MSRCR) is introduced to enhance the brightness and contrast of the RGB images. We demonstrate that the proposed method outperforms state-of-the-art methods on the SUN RGB-D and NYU Depth v2 datasets.
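The abstract names MSRCR as the enhancement step but gives no details, so the following is only a minimal sketch of the general technique, not the authors' implementation. It averages log-domain Retinex outputs over several Gaussian scales and multiplies the result by a color restoration factor. The function names `gaussian_blur` and `msrcr`, the scale values, the `alpha` and `beta` constants, and the final per-channel stretch to [0, 255] (used here in place of a tuned gain and offset) are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_blur(channel, sigma):
    """Separable Gaussian blur of a 2-D array, using edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()  # normalize so flat regions keep their value

    def blur_axis(arr, axis):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (radius, radius)
        padded = np.pad(arr, pad, mode="edge")
        return np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )

    return blur_axis(blur_axis(channel, 0), 1)


def msrcr(image, sigmas=(15.0, 80.0, 250.0), alpha=125.0, beta=46.0):
    """Illustrative MSRCR on an H x W x 3 uint8 image (assumed parameters)."""
    img = image.astype(np.float64) + 1.0  # offset avoids log(0)

    # Multi-scale Retinex: mean over scales of log(I) - log(Gaussian * I).
    msr = np.zeros_like(img)
    for sigma in sigmas:
        for c in range(img.shape[2]):
            blurred = gaussian_blur(img[..., c], sigma)
            msr[..., c] += np.log(img[..., c]) - np.log(blurred)
    msr /= len(sigmas)

    # Color restoration factor: compares each channel to the channel sum.
    total = img.sum(axis=2, keepdims=True)
    crf = beta * (np.log(alpha * img) - np.log(total))
    out = crf * msr

    # Simplification: linear stretch of each channel to [0, 255].
    lo = out.min(axis=(0, 1), keepdims=True)
    hi = out.max(axis=(0, 1), keepdims=True)
    out = (out - lo) / np.maximum(hi - lo, 1e-12) * 255.0
    return out.astype(np.uint8)


if __name__ == "__main__":
    # Synthetic dark image standing in for a low-quality RGB frame.
    rng = np.random.default_rng(0)
    dark = rng.integers(0, 60, size=(48, 64, 3), dtype=np.uint8)
    enhanced = msrcr(dark, sigmas=(2.0, 4.0, 8.0))
    print(enhanced.shape, enhanced.dtype)
```

In a pipeline like the one the abstract describes, a step of this kind would presumably be applied to each RGB image before it is passed to the recognition network; the depth channel is not touched in this sketch.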