Multi-scale Fusion Transformer Based Weakly Supervised Hashing Learning for Instance Retrieval

Yuanhai Lv, Chen Jiao, Wanqing Zhao, Wei Zhao, Ziyu Guan, Xiaofei He
DOI: https://doi.org/10.1007/s13042-023-01907-5
2023-01-01
International Journal of Machine Learning and Cybernetics
Abstract: Instance retrieval is concerned with obtaining representations of instances (objects) in images and using them for similarity comparisons between instances. However, most methods require instance-level category labels to train the model, which increases the annotation burden. Building on advances in convolutional neural networks (CNNs) and transformers in computer vision, we propose a hierarchical model with a spatial pyramid structure for weakly supervised multi-instance hash learning. It combines the local, multi-scale perception of CNNs with the global field of view of transformers. Further, it leverages the principle of multi-instance learning, allowing the proposed model to learn an instance-level hash mapping in a weakly supervised manner. Experiments on three public datasets show improved results over typical methods, validating the effectiveness of the proposed method.