DeSD: Self-Supervised Learning with Deep Self-Distillation for 3D Medical Image Segmentation

Yiwen Ye, Jianpeng Zhang, Ziyang Chen, Yong Xia
DOI: https://doi.org/10.1007/978-3-031-16440-8_52
2022-01-01
Abstract: Self-supervised learning (SSL), which enables strong performance with few annotations, has proven successful in medical image segmentation. SSL usually relies on measuring the similarity of features obtained at the deepest layer, attracting the features of positive pairs or repelling the features of negative pairs, and may therefore suffer from weak supervision at shallow layers. To address this issue, we reformulate SSL in a Deep Self-Distillation (DeSD) manner to improve the representation quality of both shallow and deep layers. Specifically, the DeSD model is composed of an online student network and a momentum teacher network, each built by stacking multiple sub-encoders. The features produced by each sub-encoder in the student network are trained to match the features produced by the teacher network. This deep self-distillation supervision improves the representation quality of all sub-encoders, shallow and deep alike. We pretrain the DeSD model on a large-scale unlabeled dataset and evaluate it on seven downstream segmentation tasks. Our results indicate that the proposed DeSD model achieves superior pre-training performance over existing SSL methods, setting a new state of the art. The code is available at https://github.com/yeerwen/DeSD.
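The student/teacher scheme described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (see the repository linked above for that). It assumes, for illustration only, that every student sub-encoder is matched against the teacher's deepest output, that the matching loss is a cross-entropy between temperature-scaled softmax distributions, and that the teacher follows the student through an exponential moving average. The function names, temperatures, and momentum value are all hypothetical.

```python
import math


def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, student_probs, eps=1e-12):
    """Cross-entropy H(target, student); eps guards against log(0)."""
    return -sum(t * math.log(s + eps) for t, s in zip(target_probs, student_probs))


def deep_self_distillation_loss(student_outputs, teacher_output,
                                student_temp=0.1, teacher_temp=0.04):
    """Average the matching loss over ALL student sub-encoders.

    Assumption: the teacher's deepest output is the shared target, and the
    temperatures are illustrative values, not taken from the paper.
    """
    target = softmax(teacher_output, teacher_temp)
    losses = [cross_entropy(target, softmax(out, student_temp))
              for out in student_outputs]
    return sum(losses) / len(losses)


def ema_update(teacher_params, student_params, momentum=0.996):
    """Momentum (EMA) update: the teacher slowly tracks the student."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


# Toy projected outputs of three student sub-encoders (shallow to deep)
# and the teacher's deepest output for another view of the same volume.
student_outputs = [[0.2, 0.1, -0.3], [0.5, -0.2, 0.0], [1.0, -0.5, 0.1]]
teacher_output = [1.1, -0.4, 0.2]

loss = deep_self_distillation_loss(student_outputs, teacher_output)
new_teacher = ema_update([1.0, 2.0], [0.0, 0.0], momentum=0.9)
```

Because the loss is averaged over every sub-encoder rather than computed only at the deepest one, shallow sub-encoders receive a direct training signal, which is the point the abstract makes about weak supervision at shallow layers.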