A deep learning-based global and segmentation-based semantic feature fusion approach for indoor scene classification

Ricardo Pereira, Tiago Barros, Luís Garrote, Ana Lopes, Urbano J. Nunes
DOI: https://doi.org/10.1016/j.patrec.2024.01.022
IF: 4.757
2024-01-26
Pattern Recognition Letters
Abstract: This work proposes a novel approach that uses a semantic segmentation mask to obtain a 2D spatial layout of the segmentation categories across the scene, designated as segmentation-based semantic features (SSFs). For each segmentation category, these features represent the pixel count as well as the 2D average position and its standard deviation. A two-branch network, GS²F²App, is also proposed; it exploits CNN-based global features extracted from RGB images together with segmentation-based features extracted from the proposed SSFs. GS²F²App was evaluated on two indoor scene benchmark datasets, SUN RGB-D and NYU Depth V2, achieving state-of-the-art results on both.
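The SSF description above can be illustrated with a minimal sketch. It assumes the segmentation mask is a 2D integer array of category ids and computes, per category, the pixel count plus the mean and standard deviation of the row and column coordinates. The function name `segmentation_semantic_features`, the five-column layout, the use of raw (unnormalized) pixel coordinates, and the all-zero row for absent categories are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def segmentation_semantic_features(mask, num_categories):
    """Sketch of per-category layout features from a segmentation mask.

    Returns an array of shape (num_categories, 5) whose columns are
    [pixel_count, mean_row, mean_col, std_row, std_col].
    The column layout is an assumption made for this example.
    """
    mask = np.asarray(mask)
    features = np.zeros((num_categories, 5), dtype=np.float64)
    for category in range(num_categories):
        # Coordinates of every pixel labelled with this category.
        rows, cols = np.nonzero(mask == category)
        if rows.size == 0:
            # Assumption: a category absent from the scene keeps all zeros.
            continue
        features[category] = [
            rows.size,    # pixel count
            rows.mean(),  # average vertical position
            cols.mean(),  # average horizontal position
            rows.std(),   # vertical spread (population std)
            cols.std(),   # horizontal spread (population std)
        ]
    return features


if __name__ == "__main__":
    # Toy 2x3 mask with two categories present out of three.
    toy_mask = np.array([[0, 0, 1],
                         [0, 1, 1]])
    print(segmentation_semantic_features(toy_mask, num_categories=3))
```

In a two-branch setup like the one the abstract describes, a feature array of this kind would feed the segmentation-based branch, while the RGB image feeds the CNN-based global branch; how the paper normalizes or encodes the features before fusion is not stated in the abstract.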
computer science, artificial intelligence