Cross Modal Retrieval Algorithm Based on Iterative Queries

Xiuchuan Cheng, Xiaoyu Yang, Huiping Li, Zhiguo Wang, Guangqiang Yin
DOI: https://doi.org/10.1007/978-981-99-9243-0_33
2024-01-01
Abstract: Single-modal information retrieval is increasingly unable to meet growing information processing needs, and cross-modal retrieval based on deep learning is receiving more attention as an alternative. To address the problem of imprecise text queries in cross-modal retrieval, an iterative query-based cross-modal retrieval model is proposed. The model is divided into four modules: image feature extraction, text feature extraction, matching and ranking, and query reinforcement. It first extracts image and text features with deep learning models, then matches and retrieves image-text features with the image-text stacked cross-attention algorithm. Finally, the query reinforcement module uses deep reinforcement learning to select the most distinctive object category in the retrieval results and presents it to the user for confirmation, thereby enriching the text query and improving retrieval performance.
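The retrieve, ask, and refine loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are toy 3-dimensional embeddings, the attention temperature is an arbitrary choice, and the category selection is a simple counting heuristic standing in for the deep reinforcement learning policy. All function names and data below are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def softmax(xs, temperature=4.0):
    # Temperature is an assumed value, not taken from the paper.
    m = max(xs)
    exps = [math.exp(temperature * (x - m)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention_score(words, regions):
    """Text-to-image attention: each word attends over the image regions,
    and the word is compared with its attended region vector."""
    scores = []
    for w in words:
        weights = softmax([cosine(w, r) for r in regions])
        attended = [sum(wt * r[i] for wt, r in zip(weights, regions))
                    for i in range(len(w))]
        scores.append(cosine(w, attended))
    return sum(scores) / len(scores)

def rank(words, gallery):
    return sorted(gallery,
                  key=lambda img: cross_attention_score(words, img["regions"]),
                  reverse=True)

def most_distinctive_category(top_images, asked):
    # Heuristic stand-in for the learned policy: prefer the category that
    # appears in closest to half of the top results, since confirming or
    # rejecting it separates those results best.
    counts = {}
    for img in top_images:
        for c in set(img["categories"]):
            counts[c] = counts.get(c, 0) + 1
    candidates = {c: n for c, n in counts.items() if c not in asked}
    if not candidates:
        return None
    half = len(top_images) / 2
    return min(sorted(candidates), key=lambda c: abs(candidates[c] - half))

def iterative_retrieve(words, gallery, embed, confirm, rounds=3, top_k=2):
    words, asked = list(words), set()
    for _ in range(rounds):
        top = rank(words, gallery)[:top_k]
        category = most_distinctive_category(top, asked)
        if category is None:
            break
        asked.add(category)
        if confirm(category):  # the user confirms the category is wanted
            words.append(embed(category))  # enrich the text query
    return rank(words, gallery)

# Toy data: one-hot "embeddings" shared by words and image regions.
EMBED = {"dog": [1.0, 0.0, 0.0], "grass": [0.0, 1.0, 0.0], "sofa": [0.0, 0.0, 1.0]}
gallery = [
    {"id": "A", "regions": [EMBED["dog"], EMBED["grass"]], "categories": ["dog", "grass"]},
    {"id": "B", "regions": [EMBED["dog"], EMBED["sofa"]], "categories": ["dog", "sofa"]},
]
asked_log = []

def confirm(category):
    asked_log.append(category)
    return category == "sofa"  # simulated user who wants a dog on a sofa

results = iterative_retrieve([EMBED["dog"]], gallery, EMBED.__getitem__, confirm)
```

In this toy run the initial query "dog" cannot separate the two images. Once the simulated user confirms "sofa", the enriched query ranks image B first.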