Simulation of cross-modal image-text retrieval algorithm under convolutional neural network structure and hash method
XianBen Yang, Wei Zhang
DOI: https://doi.org/10.1007/s11227-021-04157-w
IF: 3.3
2021-11-05
The Journal of Supercomputing
Abstract: This work aims to quickly find useful information in massive image databases, given the images, videos, and other multimedia data generated on Internet platforms such as WeChat, Sina Weibo, and Twitter. A hash algorithm maps high-dimensional image features into binary code strings to address the problems of traditional image retrieval algorithms, such as high feature dimension, large storage space, and low retrieval efficiency. An end-to-end deep hash learning network is designed based on semantic preservation, with a deep Convolutional Neural Network (CNN) adopted for image feature extraction and hash function learning. A binary-constrained regularization term is added to the loss function, and a semantic preservation layer is added to optimize hash code generation. Furthermore, a semi-supervised hash learning network is proposed based on the Generative Adversarial Network (GAN); it takes the hash network as the discriminator and introduces a discriminating node into the output layer to distinguish real samples from generated ones. For each innovation, retrieval experiments are carried out on several public datasets to verify the effectiveness of the proposed algorithm. The results show that the adaptive-weight loss function of the hash network reduces the influence of imbalanced positive and negative samples on retrieval performance, while the constrained regularization terms reduce the error caused by quantization, avoid the information loss caused by the "relaxation" strategy in traditional methods, and optimize the hash structure. As a result, the obtained hash codes retain semantic similarity and improve retrieval accuracy. The GAN-based hash network improves the overall performance of the model by about 3%.
Compared with existing image retrieval algorithms, the deep hash learning algorithm achieves better retrieval performance than current hash methods. The study effectively addresses problems in image and text information retrieval and can provide ideas and methods for research on text retrieval.
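As a rough illustration of the hashing idea described in the abstract, the sketch below maps a feature vector to a binary code, ranks database items by Hamming distance, and computes a penalty that pushes relaxed outputs toward binary values. It is a minimal sketch under stated assumptions, not the authors' implementation: the fixed linear `projection` stands in for the learned CNN hash layer, the penalty form (|a| - 1)^2 is one common choice of binary-constraint regularizer that the abstract does not specify, and all function names are hypothetical.

```python
def binarize(features, projection):
    """Map a real-valued feature vector to a binary code.

    Assumption: `projection` is a fixed list of weight rows standing in for
    the learned hash layer of the network; each row yields one bit.
    """
    code = []
    for row in projection:
        activation = sum(w * x for w, x in zip(row, features))
        code.append(1 if activation >= 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count the bit positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def quantization_penalty(activations):
    """Mean squared gap between |activation| and 1.

    Assumption: an illustrative regularizer; it is zero when every relaxed
    output is already exactly +1 or -1, and grows as outputs drift toward 0.
    """
    return sum((abs(a) - 1.0) ** 2 for a in activations) / len(activations)


def retrieve(query_code, database_codes, top_k=3):
    """Return indices of the top_k database codes closest in Hamming distance."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:top_k]
```

Comparing short binary codes with Hamming distance is what gives hashing its storage and speed advantage over comparing high-dimensional real-valued features directly.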
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture