Multi-class Indoor Semantic Segmentation with Deep Structured Model

Chuanxia Zheng, Jianhua Wang, Weihai Chen, Xingming Wu
DOI: https://doi.org/10.1007/s00371-017-1411-8
IF: 2.835
2018-01-01
The Visual Computer
Abstract: Indoor semantic segmentation plays a critical role in many applications, such as intelligent robots. However, multi-class recognition remains challenging, especially for pixel-level indoor semantic labeling. In this paper, we propose a novel deep structured model that combines the strengths of the widely used convolutional neural networks (CNNs) and recurrent neural networks (RNNs). We first present a multi-information fusion model that uses scene category information to fine-tune the fully convolutional network. Then, to refine the coarse outputs of the CNN, an RNN is applied to the final CNN layer, yielding an end-to-end trainable system. This Graph-RNN is derived from a conditional random field defined on a graph of superpixels, so it can exploit flexible contextual information from different neighboring regions. Experimental results on the recent large-scale SUN RGB-D dataset demonstrate that the proposed model outperforms existing state-of-the-art methods on the challenging task of labeling the 40 dominant classes (\(40.8\%\) mean intersection over union (IU) and \(69.1\%\) pixel accuracy). We also evaluate our model on the public NYU Depth V2 dataset and achieve remarkable performance.
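The two figures quoted in the abstract, pixel accuracy and mean IU, are commonly computed from a class confusion matrix. The sketch below illustrates those standard definitions on a toy label sequence. It is not the authors' evaluation code: the function names, the flattened label lists, and the choice to skip classes that appear in neither the ground truth nor the prediction are all assumptions made for illustration, and the paper's exact protocol (for example, how unlabeled pixels are handled) is not stated in the abstract.

```python
def confusion_matrix(gt, pred, num_classes):
    """Count label pairs: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        cm[g][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the ground-truth class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iu(cm):
    """Average over classes of intersection / union.

    Assumption: classes absent from both ground truth and prediction are skipped.
    """
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]                              # intersection
        gt_i = sum(cm[i])                          # pixels labeled i in the ground truth
        pred_i = sum(cm[j][i] for j in range(n))   # pixels predicted as i
        union = gt_i + pred_i - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: six "pixels" and three classes (hypothetical data).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(gt, pred, 3)
print(pixel_accuracy(cm))  # 4 of 6 pixels are correct
print(mean_iu(cm))         # per-class IU values are 1/3, 2/3 and 1/2
```

Mean IU is usually the stricter of the two numbers, because every class contributes equally regardless of how many pixels it covers, whereas pixel accuracy is dominated by large, frequent classes such as walls and floors.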