Siamese self-supervised learning for fine-grained visual classification

Ruyi Ji, Jiaying Li, Libo Zhang
DOI: https://doi.org/10.1016/j.cviu.2023.103658
IF: 4.886
2023-03-01
Computer Vision and Image Understanding
Abstract: Fine-grained visual classification (FGVC) is challenging because large intra-class and small inter-class variances make subtle yet distinct visual cues hard to capture. To this end, we propose a new Siamese Self-supervised Learning method that aligns different views of one image. Specifically, we employ an attention mechanism to explore the semantic parts of an image, and then generate different views using a crop-and-erase strategy. Meanwhile, we adopt a Siamese network to perform feature alignment across the views and capture view-invariant features in a self-supervised way. Finally, we introduce a center loss to explicitly enforce consistency between different views. Extensive experimental results show that the proposed method performs on par with state-of-the-art methods on three public benchmarks: CUB-200-2011, FGVC-Aircraft, and Stanford Cars.
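The abstract names three ingredients: attention-guided crop and erase views, feature alignment across views, and a center loss for view consistency. The following is a minimal NumPy sketch of one plausible reading of those steps, not the authors' implementation. The function names, the 0.5 threshold, the use of negative cosine similarity for alignment, and the definition of the center as the mean of the view features are all assumptions made for illustration; the abstract does not specify them. The attention map is assumed to have the same spatial size as the image.

```python
import numpy as np

def attention_bbox(attn, thresh=0.5):
    """Bounding box (r0, r1, c0, c1) of cells with attention >= thresh * max.

    The threshold rule is an assumption, not taken from the paper.
    """
    mask = attn >= thresh * attn.max()
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1

def crop_and_erase(image, attn, thresh=0.5):
    """Return two views: the attended region cropped out, and the image
    with that same region zeroed (erased)."""
    r0, r1, c0, c1 = attention_bbox(attn, thresh)
    crop_view = image[r0:r1, c0:c1].copy()
    erase_view = image.copy()
    erase_view[r0:r1, c0:c1] = 0
    return crop_view, erase_view

def alignment_loss(z1, z2):
    """Negative cosine similarity between two view embeddings
    (an assumed choice of alignment objective)."""
    z1 = z1 / np.linalg.norm(z1)
    z2 = z2 / np.linalg.norm(z2)
    return -float(np.dot(z1, z2))

def view_center_loss(features):
    """Mean squared distance of each view feature to the views' centroid
    (an assumed form of the center loss)."""
    features = np.asarray(features, dtype=float)
    center = features.mean(axis=0)
    return float(((features - center) ** 2).sum(axis=1).mean())
```

In a full pipeline the embeddings passed to `alignment_loss` and `view_center_loss` would come from a shared-weight (Siamese) backbone applied to each view; that network is omitted here.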
computer science, artificial intelligence; engineering, electrical & electronic