Image annotation of ancient Chinese architecture based on visual attention mechanism and GCN

Sulan Zhang, Songzan Chen, Jifu Zhang, Zhenjiao Cai, Lihua Hu
DOI: https://doi.org/10.1007/s11042-022-12618-4
IF: 2.577
2022-05-07
Multimedia Tools and Applications
Abstract: Ancient Chinese architecture (ACA), especially its roof ridge decoration, vividly exhibits Chinese civilization, and typical, accurate image semantics can reflect the historical style of ACA in a given period. However, current research on 2D image annotation of ACA does not annotate historical and cultural information (such as dynasties, regions, etc.). In addition, as ACA labels are enriched, the number of irrelevant labels increases. To solve these problems, we propose an ACA image annotation method based on a visual attention mechanism and a graph convolutional network (GCN). First, exploiting the uniqueness of ACA roof ridge decoration, we introduce a visual attention mechanism into the convolutional neural network (CNN) to focus on the roof ridge decoration area and extract the corresponding image features. Second, to avoid outputting irrelevant labels, we construct a correlation matrix in the GCN to propagate the correlations between ACA labels and obtain a label-related classifier. Finally, the classifier is applied to the extracted image features and trained with a multi-label loss. Experiments on six types of ACA datasets demonstrate that the proposed method effectively improves annotation accuracy and enriches the semantic information of ACA.
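The second and third steps of the abstract (a label correlation matrix fed to a GCN, whose output acts as a classifier on image features) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not say how the correlation matrix is built, so the thresholded conditional-probability construction, the two-layer GCN, the label embeddings `E`, the weights `W1` and `W2`, and all dimensions are assumptions chosen for illustration. The attention-based CNN is replaced by a placeholder feature vector.

```python
import numpy as np

def label_correlation(Y, threshold=0.4):
    """Assumed construction: A[i, j] = 1 if P(label j | label i) >= threshold.
    Y is a binary (n_images, n_labels) annotation matrix."""
    co = (Y.T @ Y).astype(float)                  # label co-occurrence counts
    counts = np.diag(co).copy()                   # occurrences of each label
    cond = co / np.maximum(counts[:, None], 1.0)  # row i: P(j | i)
    A = (cond >= threshold).astype(float)
    np.fill_diagonal(A, 1.0)                      # self-loops
    return A

def normalize_adjacency(A):
    """D^{-1/2} A D^{-1/2} scaling using row sums (>= 1 thanks to self-loops)."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_classifiers(A_hat, E, W1, W2):
    """Two graph-convolution layers mapping label embeddings E to one
    classifier vector per label, shape (n_labels, feat_dim)."""
    H = np.maximum(A_hat @ E @ W1, 0.0)           # ReLU
    return A_hat @ H @ W2

def multilabel_bce(scores, targets):
    """Mean binary cross-entropy over labels (a common multi-label loss)."""
    p = 1.0 / (1.0 + np.exp(-scores))
    eps = 1e-12
    return float(-np.mean(targets * np.log(p + eps)
                          + (1.0 - targets) * np.log(1.0 - p + eps)))

# Toy data: 4 images, 3 hypothetical labels (e.g. a dynasty, a region, a ridge type).
Y = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [1, 1, 0]])
rng = np.random.default_rng(0)
n_labels, embed_dim, hidden_dim, feat_dim = 3, 5, 6, 8
E = rng.normal(size=(n_labels, embed_dim))        # hypothetical label embeddings
W1 = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
W2 = rng.normal(size=(hidden_dim, feat_dim)) * 0.1

A = label_correlation(Y)
classifiers = gcn_classifiers(normalize_adjacency(A), E, W1, W2)
x = rng.normal(size=feat_dim)                     # placeholder for attention-CNN features
scores = classifiers @ x                          # one score per label
loss = multilabel_bce(scores, Y[0].astype(float))
```

In this sketch the correlation matrix is asymmetric on purpose: a rare label can strongly imply a frequent one without the reverse holding, which is one plausible way to suppress irrelevant labels.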
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering