Self-Supervised Siamese Autoencoders

Friederike Baier, Sebastian Mair, Samuel G. Fadel
2023-04-06
Abstract: Fully supervised models often require large amounts of labeled training data, which tends to be costly and hard to acquire. In contrast, self-supervised representation learning reduces the amount of labeled data needed for achieving the same or even higher downstream performance. The goal is to pre-train deep neural networks on a self-supervised task such that afterwards the networks are able to extract meaningful features from raw input data. These features are then used as inputs in downstream tasks, such as image classification. Previously, autoencoders and Siamese networks such as SimSiam have been successfully employed in those tasks. Yet, challenges remain, such as matching characteristics of the features (e.g., level of detail) to the given task and data set. In this paper, we present a new self-supervised method that combines the benefits of Siamese architectures and denoising autoencoders. We show that our model, called SidAE (Siamese denoising autoencoder), outperforms two self-supervised baselines across multiple data sets, settings, and scenarios. Crucially, this includes conditions in which only a small amount of labeled data is available.
Machine Learning, Computer Vision and Pattern Recognition
### What problem does this paper attempt to address?

Fully supervised image recognition models require large amounts of labeled data, which is costly and difficult to obtain. Self-supervised representation learning reduces the amount of labeled data needed to reach the same or even higher downstream performance. The paper proposes a new self-supervised method, **SidAE (Siamese denoising autoencoder)**, which combines the advantages of Siamese networks and denoising autoencoders. In the authors' experiments, SidAE outperforms two self-supervised baselines across multiple datasets, settings, and scenarios, including conditions in which only a small amount of labeled data is available.

The approach rests on three elements:

1. **Self-supervised pre-training**: The network is pre-trained on unlabeled data so that it learns to extract meaningful features from raw inputs.
2. **Feature extraction and downstream tasks**: The extracted features serve as inputs to downstream tasks such as image classification.
3. **Model structure**: The model combines a Siamese architecture with a denoising autoencoder, so it can process different views of the same input while remaining robust to noise.

In summary, the paper presents SidAE as an effective self-supervised learning method that performs well across the evaluated scenarios, especially when labeled data is limited.
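As a rough illustration of how a Siamese objective and a denoising objective might be combined, the following Python sketch computes a weighted sum of a SimSiam-style symmetrized negative cosine similarity and a mean-squared reconstruction error. This is not the paper's implementation: the function names, the weighting parameter `alpha`, the Gaussian corruption, and the choice to reconstruct the clean input from each view are all assumptions made for illustration. The vector arguments stand in for the outputs of encoder, predictor, and decoder networks that are not shown.

```python
import math
import random


def add_noise(x, sigma, rng):
    """Corrupt an input with Gaussian noise (one possible corruption; an assumption)."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def neg_cosine(p, z, eps=1e-12):
    """Negative cosine similarity between a prediction p and a target z.

    In SimSiam, z is treated as a constant (stop-gradient); with plain
    Python lists there are no gradients, so this only computes the value.
    """
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z + eps)


def mse(x_hat, x):
    """Mean squared error between a reconstruction and the clean input."""
    return sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)


def combined_loss(x, p1, z1, p2, z2, x_hat1, x_hat2, alpha=0.5):
    """Hypothetical weighted sum of a Siamese term and a denoising term.

    x          : clean input
    p1, p2     : predictor outputs for view 1 and view 2
    z1, z2     : encoder outputs (projections) for view 1 and view 2
    x_hat1/2   : decoder reconstructions from view 1 and view 2
    alpha      : hypothetical trade-off weight between the two terms
    """
    # Symmetrized Siamese term: each prediction is compared to the other view.
    siamese = 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)
    # Denoising term: each corrupted view should reconstruct the clean input.
    denoise = 0.5 * mse(x_hat1, x) + 0.5 * mse(x_hat2, x)
    return alpha * siamese + (1.0 - alpha) * denoise


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.2, 0.4, 0.6, 0.8]
    view1 = add_noise(x, sigma=0.1, rng=rng)
    view2 = add_noise(x, sigma=0.1, rng=rng)
    # Stand-ins for network outputs: here the noisy views are reused directly.
    loss = combined_loss(x, p1=view1, z1=view1, p2=view2, z2=view2,
                         x_hat1=view1, x_hat2=view2)
    print(loss)
```

The Siamese term rewards agreement between the two views, while the reconstruction term pushes the representation to retain enough detail to recover the clean input. How the two terms are actually weighted and wired together in SidAE is specified in the paper, not in this summary.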