Self-Supervision with data-augmentation improves few-shot learning

Prashant Kumar, Durga Toshniwal
DOI: https://doi.org/10.1007/s10489-024-05340-1
IF: 5.3
2024-02-29
Applied Intelligence
Abstract: Self-supervised learning (SSL) has shown exceptionally promising results in natural language processing and, more recently, in image classification and recognition. Recent research has demonstrated the benefits of SSL on large unlabeled datasets. However, relatively little investigation has been done into how well it works with smaller datasets. Typically, this challenge entails training a model on a very small quantity of data and then evaluating the model on out-of-distribution data. Few-shot image classification aims to classify previously unseen classes using a limited number of training examples. Recent few-shot learning research focuses on developing good representation models that can quickly adapt to test tasks. In this paper, we investigate the role of self-supervision in the context of few-shot learning. We devise a model that improves the network's representation learning by employing a self-supervised auxiliary task based on composite rotation. The task rotates the image on two levels, inner and outer, and assigns one of 16 rotation classes to the modified image. Training the model further on this task enables it to capture robust learnable features that help it focus on finer visual details of the object present in the given image. We find that the network learns to extract more generalized and discriminative features, which in turn enhances the effectiveness of its few-shot classification. This approach significantly outperforms the state-of-the-art on several public benchmarks. In addition, we demonstrate empirically that models trained using the proposed approach perform better than the baseline model even when the query examples in the episode are not aligned with the support examples. Extensive ablation experiments are performed to validate the various components of our approach. We also investigate our strategy's impact on the network's ability to discriminate visual features.
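The composite rotation task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the inner region, the order of the two rotations, or the label encoding. The sketch assumes the inner region is the central square covering half the side length, that each level rotates by a multiple of 90 degrees (which gives 4 x 4 = 16 classes), that the inner rotation is applied first, and that the label is 4 * outer + inner. The function names `composite_rotate` and `all_composite_views` are hypothetical.

```python
import numpy as np


def composite_rotate(image, inner_k, outer_k):
    """Rotate a central patch, then the whole image, by multiples of 90 degrees.

    image   : square array of shape (H, H) or (H, H, C)
    inner_k : number of 90-degree turns for the inner patch, 0..3
    outer_k : number of 90-degree turns for the whole image, 0..3
    Returns the rotated image and a class label in 0..15.
    """
    h, w = image.shape[:2]
    if h != w:
        raise ValueError("this sketch assumes a square image")
    out = image.copy()
    q = h // 4  # assumed margin: the inner patch is the central half of the image
    # Inner level: rotate only the central square patch in place.
    out[q:h - q, q:w - q] = np.rot90(out[q:h - q, q:w - q], k=inner_k, axes=(0, 1))
    # Outer level: rotate the entire image.
    out = np.rot90(out, k=outer_k, axes=(0, 1))
    # Assumed label encoding: one of 4 * 4 = 16 rotation classes.
    label = 4 * outer_k + inner_k
    return out, label


def all_composite_views(image):
    """Build all 16 rotated views of one image with their class labels."""
    views, labels = [], []
    for outer_k in range(4):
        for inner_k in range(4):
            view, label = composite_rotate(image, inner_k, outer_k)
            views.append(view)
            labels.append(label)
    return np.stack(views), np.array(labels)
```

In an auxiliary-task setup of this kind, the 16 views and their labels would typically feed a rotation-classification head trained alongside the main classifier. How the paper combines the two losses is not stated in the abstract.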
computer science, artificial intelligence