Abstract: A large labeled dataset is key to the success of supervised deep learning, but for medical image segmentation, it is highly challenging to obtain sufficient annotated images for model training. In many scenarios, unannotated images are abundant and easy to acquire. Self-supervised learning (SSL) has shown great potential in exploiting raw data information and representation learning. In this paper, we propose Hierarchical Self-Supervised Learning (HSSL), a new self-supervised framework that boosts medical image segmentation by making good use of unannotated data. Unlike the current literature on task-specific self-supervised pre-training followed by supervised fine-tuning, we utilize SSL to learn task-agnostic knowledge from heterogeneous data for various medical image segmentation tasks. Specifically, we first aggregate a dataset from several medical challenges, then pre-train the network in a self-supervised manner, and finally fine-tune on labeled data. We develop a new loss function by combining contrastive loss and classification loss, and pre-train an encoder-decoder architecture for segmentation tasks. Our extensive experiments show that multi-domain joint pre-training benefits downstream segmentation tasks and outperforms single-domain pre-training significantly.
Compared to learning from scratch, our method yields better performance on various tasks (e.g., +0.69% to +18.60% in Dice with 5% of annotated data). With limited amounts of training data, our method can substantially bridge the performance gap with respect to denser annotations (e.g., 10% vs. 100% annotations).
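To make the idea of combining a contrastive loss with a classification loss concrete, the following is a minimal sketch and not the HSSL loss itself, whose exact form the abstract does not give. It assumes an InfoNCE-style contrastive term over paired embeddings, a cross-entropy term over a hypothetical pretext label (for example, which source dataset an image came from), and a simple weighted sum. All function names, the `weight` parameter, and the `temperature` value are illustrative assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss: anchors[i] should match positives[i];
    the other entries of `positives` serve as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        total += -log_softmax(logits)[i]
    return total / len(anchors)


def cross_entropy(logits_batch, labels):
    """Mean cross-entropy for integer class labels."""
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        total += -log_softmax(logits)[label]
    return total / len(labels)


def combined_loss(anchors, positives, class_logits, class_labels,
                  weight=1.0, temperature=0.1):
    """Assumed combination: contrastive term plus a weighted
    classification term (the weighting scheme is hypothetical)."""
    contrastive = info_nce(anchors, positives, temperature)
    classification = cross_entropy(class_logits, class_labels)
    return contrastive + weight * classification


if __name__ == "__main__":
    # Toy embeddings of two augmented views of two images.
    views_a = [[1.0, 0.0], [0.0, 1.0]]
    views_b = [[1.0, 0.0], [0.0, 1.0]]
    # Toy logits for a hypothetical two-class pretext label.
    logits = [[2.0, 0.0], [0.0, 2.0]]
    labels = [0, 1]
    print(combined_loss(views_a, views_b, logits, labels))
```

In this sketch the contrastive term falls when the two views of the same image have similar embeddings, and the classification term falls when the pretext label is predicted correctly. In practice both terms would be computed on encoder outputs inside an automatic-differentiation framework rather than on plain lists.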

Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation

Contrastive learning of global and local features for medical image segmentation with limited annotations

A General Global and Local Pre-Training Framework for 3D Medical Image Segmentation

Self-Supervised Alignment Learning for Medical Image Segmentation

Hierarchical Self-supervised Learning for Medical Image Segmentation Based on Multi-domain Data Aggregation

Localized Region Contrast for Enhancing Self-Supervised Learning in Medical Image Segmentation

Self-supervised Learning for Few-shot Medical Image Segmentation

Contrastive Semi-Supervised Learning for Domain Adaptive Segmentation Across Similar Anatomical Structures

Leveraging Unlabeled Data for 3D Medical Image Segmentation through Self-Supervised Contrastive Learning

Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation

Self-Ensembling Contrastive Learning for Semi-Supervised Medical Image Segmentation

Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation

Bootstrap Representation Learning for Segmentation on Medical Volumes and Sequences

Positional Information is a Strong Supervision for Volumetric Medical Image Segmentation

PA-Seg: Learning from Point Annotations for 3D Medical Image Segmentation using Contextual Regularization and Cross Knowledge Distillation

A Unified Visual Information Preservation Framework for Self-supervised Pre-training in Medical Image Analysis

Two-Stage Multi-task Self-Supervised Learning for Medical Image Segmentation

MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset

SC-SSL: Self-correcting Collaborative and Contrastive Co-training Model for Semi-Supervised Medical Image Segmentation