Knowledge distillation for semi-supervised domain adaptation

Mauricio Orbes-Arteaga, Jorge Cardoso, Lauge Sørensen, Christian Igel, Sebastien Ourselin, Marc Modat, Mads Nielsen, Akshay Pai

DOI: https://doi.org/10.48550/arXiv.1908.07355

2019-08-16

Abstract: In the absence of sufficient data variation (e.g., scanner and protocol variability) in annotated data, deep neural networks (DNNs) tend to overfit during training. As a result, their performance is significantly lower on data from unseen sources compared to the performance on data from the same source as the training data. Semi-supervised domain adaptation methods can alleviate this problem by tuning networks to new target domains without the need for annotated data from these domains. Adversarial domain adaptation (ADA) methods are a popular choice that aim to train networks in such a way that the features generated are domain agnostic. However, these methods require careful dataset-specific selection of hyperparameters such as the complexity of the discriminator in order to achieve a reasonable performance. We propose to use knowledge distillation (KD) -- an efficient way of transferring knowledge between different DNNs -- for semi-supervised domain adaption of DNNs. It does not require dataset-specific hyperparameter tuning, making it generally applicable. The proposed method is compared to ADA for segmentation of white matter hyperintensities (WMH) in magnetic resonance imaging (MRI) scans generated by scanners that are not a part of the training set. Compared with both the baseline DNN (trained on source domain only and without any adaption to target domain) and with using ADA for semi-supervised domain adaptation, the proposed method achieves significantly higher WMH dice scores.

Machine Learning, Image and Video Processing

What problem does this paper attempt to address?

This paper addresses the performance degradation of deep neural networks (DNNs) in medical image segmentation when the training data and the test data come from different distributions. Labeling medical imaging data is expensive, and large labeled data sets are hard to obtain, so training sets often fail to cover all relevant data variations (such as differences in scanners and protocols). As a result, DNNs perform poorly on data from unseen sources.

To solve this problem, the paper proposes semi-supervised domain adaptation based on knowledge distillation (KD), which reduces the dependence on labeled data in the target domain and improves the generalization of the model to new data sources. Knowledge is transferred from a teacher model trained on the source domain to a student model, so that the student adapts to the data characteristics of the target domain and segments target-domain images more accurately.

The paper evaluates baseline models (trained only on the source domain), adversarial domain adaptation (ADA), and the proposed KD method on white matter hyperintensities (WMH) segmentation across several clinical scenarios. The experimental results show that the proposed KD method outperforms ADA in most scenarios, with the adaptation from the Utrecht clinic to the Singapore clinic as the exception, and that the gain is most pronounced for small lesions. A further advantage of KD over ADA is its simpler design: it does not require extensive adjustments to the network architecture, and good performance can be achieved by selecting an appropriate temperature parameter.
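The role of the temperature parameter can be illustrated with a minimal sketch of a generic temperature-scaled distillation loss. This is not the authors' implementation: the summary above does not state the paper's exact loss, temperature value, or training schedule. The function names, the example logits, and the default temperature of 2.0 are all hypothetical, and the sketch assumes that the teacher's softened predictions on unlabeled target-domain voxels serve as training targets for the student.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of per-class logits."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student outputs for one voxel.

    Assumption: both networks emit raw logits for the same classes
    (for example background vs. lesion).
    """
    soft_targets = softmax_with_temperature(teacher_logits, temperature)
    student_probs = softmax_with_temperature(student_logits, temperature)
    # Clamp probabilities away from zero so the logarithm stays finite.
    return -sum(
        t * math.log(max(s, 1e-12))
        for t, s in zip(soft_targets, student_probs)
    )


def mean_distillation_loss(teacher_batch, student_batch, temperature=2.0):
    """Average the per-voxel distillation loss over a batch of voxels."""
    losses = [
        distillation_loss(t, s, temperature)
        for t, s in zip(teacher_batch, student_batch)
    ]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical logits for three voxels, two classes each.
    teacher = [[2.0, 0.0], [0.5, 1.5], [-1.0, 3.0]]
    student = [[1.0, 0.5], [0.0, 1.0], [0.0, 2.0]]
    print(mean_distillation_loss(teacher, student, temperature=2.0))
```

A higher temperature flattens the teacher's distribution, so the student also learns from the relative scores of the less likely classes rather than only from the teacher's hard decision. In a real training loop this loss would be computed on network outputs inside a deep learning framework and minimized with respect to the student's weights.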

MCKD: Mutually Collaborative Knowledge Distillation for Federated Domain Adaptation and Generalization

GKD: Semi-supervised Graph Knowledge Distillation for Graph-Independent Inference

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

Knowledge Distillation for Adaptive MRI Prostate Segmentation Based on Limit-Trained Multi-Teacher Models

Adaptive Knowledge Distillation for High-Quality Unsupervised MRI Reconstruction with Model-Driven Priors

Kd3a: Unsupervised Multi-Source Decentralized Domain Adaptation Via Knowledge Distillation

MSCDA: Multi-level Semantic-guided Contrast Improves Unsupervised Domain Adaptation for Breast MRI Segmentation in Small Datasets

Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation

KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow

Adaptive Affinity-Based Generalization For MRI Imaging Segmentation Across Resource-Limited Settings

Unsupervised Domain Adaptation for Brain Structure Segmentation Via Mutual Information Maximization Alignment

Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss

Unsupervised Federated Domain Adaptation for Segmentation of MRI Images

Data privacy protection domain adaptation by roughing and finishing stage

Multi-source Distilling Domain Adaptation

Enhancement and evaluation for deep learning-based classification of volumetric neuroimaging with 3D-to-2D knowledge distillation

Unpaired Multi-modal Segmentation via Knowledge Distillation

Research on a Cross-Domain Few-Shot Adaptive Classification Algorithm Based on Knowledge Distillation Technology

Towards Cross-modality Medical Image Segmentation with Online Mutual Knowledge Distillation

Knowledge mapping-based adversarial domain adaptation: A novel fault diagnosis method with high generalizability under variable working conditions