Alleviating Over-fitting in Hashing-based Fine-grained Image Retrieval: from Causal Feature Learning to Binary-injected Hash Learning

Xinguang Xiang, Xinhao Ding, Lu Jin, Zechao Li, Jinhui Tang, Ramesh Jain
DOI: https://doi.org/10.1109/tmm.2024.3410136
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Hashing-based fine-grained image retrieval pursues learning diverse local features to generate inter-class discriminative hash codes. However, existing fine-grained hash methods with attention mechanisms usually tend to just focus on a few obvious areas, which misguides the network to over-fit some salient features. Such a problem raises two main limitations: 1) It overlooks some subtle local features, degrading the generalization capability of learned embedding. 2) It causes the over-activation of some hash bits correlated to salient features, which breaks the binary code balance and further weakens the discrimination abilities of hash codes. To address these limitations of the over-fitting problem, we propose a novel hash framework (CFBH) from Causal Feature learning to Binary-injected Hash learning, which captures various local information and suppresses over-activated hash bits simultaneously. For causal feature learning, we adopt causal inference theory to alleviate the bias towards the salient regions in fine-grained images. In detail, we obtain local features from the feature map and combine this local information with original image information followed by this theory. Theoretically, these fused embeddings help the network to re-weight the retrieval effort of each local feature and exploit more subtle variations without observational bias. For binary-injected hash learning, we propose a Binary Noise Injection (BNI) module inspired by Dropout. The BNI module not only mitigates over-activation to particular bits, but also makes hash codes uncorrelated and balanced in the Hamming space. Extensive experimental results on six popular fine-grained image datasets demonstrate the superiority of CFBH over several state-of-the-art methods. The source code will be publicly available upon publication.