Attention-Based Deep Neural Network Combined Local and Global Features for Indoor Scene Recognition

Luefeng Chen, Wenhao Duan, Jiazhuo Li, Min Wu, Witold Pedrycz, Kaoru Hirota
DOI: https://doi.org/10.1109/tii.2024.3424197
IF: 12.3
2024-01-01
IEEE Transactions on Industrial Informatics
Abstract: An original attention-based indoor scene recognition model combining local and global features is proposed. Multi-strategy data augmentation, using several different functions and intensities, is first applied to improve classification performance. Then, local features are extracted using a convolutional layer and a single self-attention, thus addressing the problem of large intra-class variance. The multi-attention mechanism is used to fuse the local feature information extracted from different foci to obtain a more complete global feature representation. The multi-head attention mechanism allows the network to extract features in parallel along different directions of attention, which helps the network better capture global information, improves its ability to understand and represent the input data, and addresses the problem of high inter-class similarity. Finally, the extracted features are fed into the classifier to complete the classification of indoor scene images. Experiments conducted on four datasets (IndoorCVPR09, SUN397, 15-Scenes, and a self-built small-sample scientific indoor scene dataset) yield excellent results. The results show that the developed algorithm effectively addresses the two problems of high intra-class diversity and high inter-class similarity, and the model achieves competitive results. Preliminary application experiments are carried out in our HRI system, indicating that the proposed indoor scene recognition model can be applied to complete environmental perception in HRI.
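The fusion step described in the abstract, where multi-head attention combines local features into a global representation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attention`, `multi_head_self_attention`, `global_feature`), the identity query/key/value projections, the mean pooling, and the toy feature values are all assumptions made for illustration, and the convolutional layer, data augmentation, and classifier are omitted.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention over lists of equal-length vectors.
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def multi_head_self_attention(tokens, num_heads):
    # Split each local feature vector into num_heads slices, attend within
    # each slice, then concatenate the per-head outputs.
    # Assumption: learned Q/K/V projections are replaced by the identity.
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    fused = [[] for _ in tokens]
    for h in range(num_heads):
        sl = slice(h * head_dim, (h + 1) * head_dim)
        head_tokens = [t[sl] for t in tokens]
        head_out = attention(head_tokens, head_tokens, head_tokens)
        for row, o in zip(fused, head_out):
            row.extend(o)
    return fused

def global_feature(tokens, num_heads=2):
    # Assumption: the fused tokens are mean-pooled into one global vector.
    fused = multi_head_self_attention(tokens, num_heads)
    n = len(fused)
    return [sum(row[j] for row in fused) / n for j in range(len(fused[0]))]

# Hypothetical local features for three image regions (4 dimensions each).
local_features = [
    [1.0, 0.0, 0.5, 0.2],
    [0.0, 1.0, 0.1, 0.9],
    [0.5, 0.5, 0.3, 0.3],
]
g = global_feature(local_features, num_heads=2)
```

In this sketch each head attends over its own slice of the feature dimension, which is one simple way to realize "parallel attention in different directions"; a trained model would instead learn separate projection matrices per head and pass the resulting vector to a classifier.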