Mixture of Self-Supervised Learning

Aristo Renaldo Ruslim, Novanto Yudistira, Budi Darma Setiawan
2023-07-27
Abstract: Self-supervised learning is a popular method because it can learn image features without using labels and can overcome the limited labeled datasets that constrain supervised learning. It works by training the model on a pretext task before the model is applied to a specific task. Examples of pretext tasks used in image recognition include rotation prediction, solving jigsaw puzzles, and predicting relative positions in an image. Previous studies have used only one type of transformation as a pretext task. This raises the question of what happens when more than one pretext task is used and a gating network combines them. We therefore propose the Gated Self-Supervised Learning method to improve image classification. It uses more than one transformation as a pretext task and uses the Mixture of Experts architecture as a gating network to combine the pretext tasks, so that the model automatically learns to focus on the augmentations most useful for classification. We test the proposed method in several scenarios: imbalanced CIFAR dataset classification, adversarial perturbations, Tiny-Imagenet dataset classification, and semi-supervised learning. In addition, we use Grad-CAM and t-SNE analyses to examine how well the proposed method identifies the important features that influence image classification, represents the data of each class, and separates different classes. Our code is available at https://github.com/aristorenaldo/G-SSL
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address several key issues in self-supervised learning for image classification:

1. **Feature Learning**: Self-supervised learning methods can learn image features through pretext tasks without using labeled data. These pretext tasks include rotation prediction, jigsaw puzzles, etc. However, existing research only uses a single type of transformation as a pretext task, which limits the model's ability to learn various features.
2. **Multi-task Combination**: The paper proposes a method called "Gated Self-Supervised Learning" to improve image classification performance by combining multiple pretext tasks. Specifically, this method uses various transformations such as Localizable Rotation, Horizontal Flip, and RGB Channel Permutation. It employs a gating network in a Mixture of Experts architecture to automatically weight the importance of each pretext task.
3. **Experimental Validation**: To validate the effectiveness of the proposed method, the authors conducted tests in multiple scenarios, including imbalanced CIFAR dataset classification, adversarial perturbations, Tiny-Imagenet dataset classification, and semi-supervised learning. Additionally, Grad-CAM and T-SNE analyses were used to visualize and evaluate the model's ability to recognize important features.

In summary, this paper focuses on enhancing self-supervised learning performance in image classification by combining multiple pretext tasks and demonstrates the effectiveness and superiority of the proposed method through experiments.
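
To make the idea of gated pretext tasks more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see their repository for that). The function names (`rotate90`, `hflip`, `permute_channels`, `make_pretext_sample`, `gated_loss`) are hypothetical, the rotation is applied to the whole image rather than to a local patch as in Localizable Rotation, the gate is a plain softmax over given logits rather than a learned network, and adding the weighted pretext losses to the classification loss is an assumption about how the terms might be combined.

```python
import math
import random
from itertools import permutations

# All 6 orderings of the (R, G, B) channels; the list index is the pretext label.
CHANNEL_PERMS = list(permutations(range(3)))

def rotate90(img, k):
    """Rotate an image (rows of RGB tuples) clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        img = [list(row) for row in zip(*img[::-1])]
    return img

def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def permute_channels(img, perm):
    """Reorder the colour channels of every pixel according to perm."""
    return [[tuple(px[c] for c in perm) for px in row] for row in img]

def make_pretext_sample(img, rng):
    """Apply one random choice per transformation and return the labels to predict."""
    rot = rng.randrange(4)
    flip = rng.randrange(2)
    perm = rng.randrange(len(CHANNEL_PERMS))
    out = rotate90(img, rot)
    if flip:
        out = hflip(out)
    out = permute_channels(out, CHANNEL_PERMS[perm])
    return out, {"rotation": rot, "flip": flip, "channel_perm": perm}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gated_loss(cls_loss, pretext_losses, gate_logits):
    """Assumed combination: classification loss plus gate-weighted pretext losses."""
    weights = softmax(gate_logits)
    weighted = sum(w * l for w, l in zip(weights, pretext_losses))
    return cls_loss + weighted, weights

if __name__ == "__main__":
    image = [[(1, 2, 3), (4, 5, 6)], [(7, 8, 9), (10, 11, 12)]]
    sample, labels = make_pretext_sample(image, random.Random(0))
    total, weights = gated_loss(1.0, [0.3, 0.6, 0.9], [0.5, 0.1, -0.2])
    print(labels, weights, total)
```

In a real training loop the gate logits would come from a small network fed with the backbone features, so the weights can shift toward whichever pretext task helps classification most. The fixed logits here only illustrate the weighting step.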