Minimum Volume Simplex-Based Scene Representation and Attribute Recognition with Feature Fusion.

Zou Zhiyuan, Liu Weibin, Xing Weiwei, Zhang Shunli
DOI: https://doi.org/10.1007/s10489-022-03697-9
IF: 5.3
2022-01-01
Applied Intelligence
Abstract: Scene attribute recognition identifies the attribute labels of a scene image from its scene representation, enabling deeper semantic understanding of scenes. Over the past decades, numerous scene representation algorithms have been proposed, based on either feature engineering or deep convolutional neural networks. For models based on only one kind of image feature, it remains difficult to learn representations of multiple attributes from local image regions. For models based on deep learning, although multi-label annotations can be used directly to learn attribute representations, large amounts of training data are usually needed to build the multi-label model. In this paper, we address the problem through scene representation modeling with multiple features and without deep learning. First, we introduce the linear mixing model (LMM) for scene image modeling, and then present a novel approach, referred to as mini-batch minimum simplex estimation (MMSE), for attribute-based scene representation learning from highly complex image data. Finally, a two-stage multi-feature fusion method is proposed to further improve the feature representation for scene attribute recognition. The proposed method takes advantage of the fast convergence of nonnegative matrix factorization (NMF) schemes while using mini-batches to speed up computation on large-scale scene datasets. Experimental results on real scene images demonstrate that the proposed method outperforms several other advanced scene attribute recognition approaches.
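Illustrative sketch (not the authors' MMSE): the abstract combines a linear mixing model, in which each image feature vector is approximated as a nonnegative combination of a small set of basis vectors, with NMF-style updates applied to mini-batches. The code below shows only that general idea, using standard multiplicative NMF updates on random mini-batches. The function name minibatch_nmf, its parameters, and the feature-by-sample matrix layout are assumptions made for illustration; the minimum-volume simplex term, the sum-to-one constraint on the mixing coefficients, and the two-stage feature fusion described in the abstract are all omitted.

```python
import numpy as np


def minibatch_nmf(X, k, batch_size=32, epochs=50, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix X (features x samples) as A @ S.

    A (features x k) plays the role of the mixing basis in a linear
    mixing model; S (k x samples) holds nonnegative mixing coefficients.
    Hypothetical helper: plain multiplicative updates on mini-batches.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    A = rng.random((d, k)) + eps
    S = rng.random((k, n)) + eps
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb = X[:, idx]
            Sb = S[:, idx]  # fancy indexing returns a copy
            # Update the coefficients of the samples in this batch.
            Sb *= (A.T @ Xb) / (A.T @ A @ Sb + eps)
            # Update the shared basis using only this batch.
            A *= (Xb @ Sb.T) / (A @ (Sb @ Sb.T) + eps)
            S[:, idx] = Sb  # write the updated copy back
    return A, S
```

Because every update multiplies by a ratio of nonnegative quantities, A and S stay nonnegative throughout, and each step touches only batch_size columns of X, which is what keeps the per-step cost low on large datasets.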