Self-Supervised Learning as a Means To Reduce the Need for Labeled Data in Medical Image Analysis

Marin Benčević, Marija Habijan, Irena Galić, Aleksandra Pizurica
DOI: https://doi.org/10.48550/arXiv.2206.00344
2022-06-01
Abstract: One of the largest problems in medical image processing is the lack of annotated data. Labeling medical images often requires highly trained experts and can be a time-consuming process. In this paper, we evaluate a method of reducing the need for labeled data in medical image object detection by using self-supervised neural network pretraining. We use a dataset of chest X-ray images with bounding box labels for 13 different classes of anomalies. The networks are pretrained on a percentage of the dataset without labels and then fine-tuned on the rest of the dataset. We show that it is possible to achieve similar performance to a fully supervised model in terms of mean average precision and accuracy with only 60% of the labeled data. We also show that it is possible to increase the maximum performance of a fully supervised model by adding a self-supervised pretraining step, and this effect can be observed with even a small amount of unlabeled data for pretraining.
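The split described in the abstract (one portion of the dataset used without labels for pretraining, the remainder used with labels for fine-tuning) can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_for_pretraining`, the `pretrain_fraction` parameter, the random shuffling, and the integer image IDs are all assumptions made for illustration.

```python
import random


def split_for_pretraining(image_ids, pretrain_fraction, seed=0):
    """Split image IDs into two disjoint subsets.

    The first subset would be used WITHOUT labels for self-supervised
    pretraining; the second would be used WITH labels for fine-tuning.
    Hypothetical helper, not taken from the paper.
    """
    if not 0.0 <= pretrain_fraction <= 1.0:
        raise ValueError("pretrain_fraction must be in [0, 1]")
    ids = list(image_ids)
    # Shuffle with a fixed seed so the split is reproducible (an assumption;
    # the abstract does not say how images are assigned to each subset).
    random.Random(seed).shuffle(ids)
    n_pretrain = round(len(ids) * pretrain_fraction)
    return ids[:n_pretrain], ids[n_pretrain:]


# Example: 40% of a placeholder set of 1000 images is set aside for
# label-free pretraining, leaving 60% for supervised fine-tuning.
pretrain_ids, finetune_ids = split_for_pretraining(range(1000), pretrain_fraction=0.4)
```

Under these assumptions, a 0.4 pretraining fraction leaves 60% of the images for labeled fine-tuning, which corresponds to the 60% figure quoted in the abstract.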
Computer Vision and Pattern Recognition