Abstract:
Background Recently, many studies have explored fusing features extracted from convolutional neural networks (CNNs) and transformers to integrate multi‐scale representations and improve performance in medical image segmentation tasks. Although these hybrid models achieve better results than earlier CNN‐based and transformer‐based methods, they suffer from high computational and space complexity.
Purpose The purpose of this research is to address the prohibitive computational and space complexity of hybrid models, which limits their application in clinical practice, where computational resources are usually constrained.
Methods We propose a novel model equipped with a dual distillation scheme to fully exploit the complementary advantages of CNNs and transformers without compromising model efficiency. We further propose a multi‐scale prior‐knowledge distillation (MPD) module to effectively distill multi‐scale knowledge from features extracted by transformers. In addition, to work with the knowledge distillation scheme, we propose an efficient and robust Selective Fusion module in the student network.
Results We extensively evaluate the proposed model against fourteen different network frameworks on two representative datasets: SipakMed and ISIC 2017. In the SipakMed dataset, 3037 Pap smear images are used for training and 1012 for testing. In the ISIC 2017 dataset, 2000 dermoscopic images are used for training, 150 for validation, and 600 for testing. Experimental results demonstrate that our method not only surpasses existing methods by a considerable margin in mean Intersection over Union, mean Dice coefficient, and mean average symmetric surface distance, but also requires fewer computational resources in terms of model parameters and floating‐point operations (FLOPs).
Conclusions Comprehensive comparisons in terms of segmentation accuracy and computational complexity unequivocally confirm that our method effectively and efficiently integrates the advantages of both CNNs and transformers, showing its suitability and significance for clinical applications.
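For readers unfamiliar with the overlap metrics named in the Results, the following is a minimal sketch of how Intersection over Union and the Dice coefficient could be computed for binary masks and averaged over a set of images. It is not the authors' evaluation code: the function names (`iou`, `dice`, `mean_metric`), the nested-list mask format, the convention of scoring two empty masks as 1.0, and the choice to average per image rather than per class are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D binary mask, 1 = foreground (assumed format)


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted foreground, target foreground) pixel counts."""
    inter = pred_fg = target_fg = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p:
                pred_fg += 1
            if t:
                target_fg += 1
    return inter, pred_fg, target_fg


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over Union: |P ∩ T| / |P ∪ T|."""
    inter, pred_fg, target_fg = _counts(pred, target)
    union = pred_fg + target_fg - inter
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def dice(pred: Mask, target: Mask) -> float:
    """Dice coefficient: 2 |P ∩ T| / (|P| + |T|)."""
    inter, pred_fg, target_fg = _counts(pred, target)
    total = pred_fg + target_fg
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2 * inter / total


def mean_metric(metric: Callable[[Mask, Mask], float],
                preds: Sequence[Mask],
                targets: Sequence[Mask]) -> float:
    """Average a per-image metric over a dataset (per-image averaging is assumed)."""
    scores = [metric(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy example: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(iou(pred, target))   # 2 / 4
print(dice(pred, target))  # 4 / 6
```

The two metrics are monotonically related for a single mask pair (Dice = 2·IoU / (1 + IoU)), but their means over a dataset can rank methods differently, which is one reason papers often report both.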

Graph Flow: Cross-layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation

MSKD: Structured knowledge distillation for efficient medical image segmentation

Adaptive Decomposition and Shared Weight Volumetric Transformer Blocks for Efficient Patch-Free 3D Medical Image Segmentation

Efficient Medical Image Segmentation Based on Knowledge Distillation

Pixel-Level and Affinity-Level Knowledge Distillation for Unsupervised Segmentation of Covid-19 Lesions

Exploring Generalizable Distillation for Efficient Medical Image Segmentation

Graph Relation Distillation for Efficient Biomedical Instance Segmentation

A Medical Image Segmentation Method Combining Knowledge Distillation and Contrastive Learning

Efficient knowledge distillation for liver CT segmentation using growing assistant network

Cross-denoising Network against Corrupted Labels in Medical Image Segmentation with Domain Shift

Distillation Learning Guided by Image Reconstruction for One-Shot Medical Image Segmentation

Multi‐scale contextual learning for medical image segmentation via dual distillation

GID: Global information distillation for medical semantic segmentation

ShiftTransUNet: An Efficient Deep Learning Model for Medical Image Segmentation Using ShiftViT Framework

Efficient skin lesion segmentation with boundary distillation

Efficient Biomedical Instance Segmentation via Knowledge Distillation

RSKD: Enhanced medical image segmentation via multi-layer, rank-sensitive knowledge distillation in Vision Transformer models

Interactive segmentation of medical images using deep learning

Multi-Task Multi-Scale Contrastive Knowledge Distillation for Efficient Medical Image Segmentation

Enhancing Tiny Tissues Segmentation via Self-Distillation

FKD-Med: Privacy-Aware, Communication-Optimized Medical Image Segmentation via Federated Learning and Model Lightweighting Through Knowledge Distillation