Local descriptor-based spatial cross attention network for few-shot learning

DOI: https://doi.org/10.1007/s13042-024-02189-1
2024-05-12
International Journal of Machine Learning and Cybernetics
Abstract: Few-shot learning aims to classify novel images based on a small number of labeled examples. While recent work using local descriptors has shown promise, existing methods generally classify local descriptors independently, which can lose spatial and other information that is essential for new tasks. Moreover, such work ignores the fact that the semantics expressed by a local descriptor may be irrelevant to the semantics of the image. In this paper, we propose two methods to address these challenges. First, we design a novel Spatial Cross Attention Module that generates a spatial cross attention map between a query and a class representation to enhance the local descriptors that are most relevant to each task. Second, we employ a dense classification loss, which supervises the learning of all local descriptors, to constrain the semantic consistency of local descriptors. Furthermore, we show that the feature extractor trained by our method can be transferred to other baseline methods to achieve better performance. Extensive experiments conducted on three widely used few-shot learning benchmark datasets indicate that our proposed method achieves competitive results.
computer science, artificial intelligence
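
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the formulation, so the cosine-similarity attention, the `temperature` parameter, the residual re-weighting `q * (1 + attn)`, and the function names `spatial_cross_attention` and `dense_classification_loss` are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_cross_attention(query_map, class_map, temperature=0.1):
    """Hypothetical cross attention between a query and a class representation.

    query_map, class_map: feature maps of shape (C, H, W).
    Returns an (H, W) attention map over query positions and the
    re-weighted query local descriptors of shape (H*W, C).
    """
    c, h, w = query_map.shape
    q = query_map.reshape(c, -1).T   # (H*W, C) query local descriptors
    p = class_map.reshape(c, -1).T   # (H*W, C) class local descriptors
    # Cosine similarity between every query position and every class position.
    sim = l2_normalize(q) @ l2_normalize(p).T
    # Assumption: a query position is relevant if it matches the class on average.
    relevance = sim.mean(axis=1)
    attn = softmax(relevance / temperature)
    # Assumption: residual re-weighting, so no descriptor is zeroed out.
    enhanced = q * (1.0 + attn[:, None])
    return attn.reshape(h, w), enhanced


def dense_classification_loss(descriptors, weight, label):
    """Cross-entropy applied to every local descriptor with the image label.

    descriptors: (N, C); weight: (C, num_classes); label: int.
    """
    logits = descriptors @ weight
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[:, label].mean())


# Toy usage with random features (64 channels, 5x5 spatial grid, 5 classes).
rng = np.random.default_rng(0)
query_map = rng.normal(size=(64, 5, 5))
class_map = rng.normal(size=(64, 5, 5))
attn, enhanced = spatial_cross_attention(query_map, class_map)
classifier_weight = rng.normal(size=(64, 5)) * 0.01
loss = dense_classification_loss(enhanced, classifier_weight, label=2)
```

In this sketch the attention map sums to one over the query's spatial positions, and the dense loss averages the per-descriptor cross-entropy so that every local descriptor is pushed toward the image-level label.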