Network-Agnostic Knowledge Transfer for Medical Image Segmentation

Shuhang Wang, Vivek Kumar Singh, Alex Benjamin, Mercy Asiedu, Elham Yousef Kalafi, Eugene Cheah, Viksit Kumar, Anthony Samir
DOI: https://doi.org/10.48550/arXiv.2101.09560
2021-01-24
Abstract: Conventional transfer learning leverages the weights of pre-trained networks, but requires similar neural architectures. Alternatively, knowledge distillation can transfer knowledge between heterogeneous networks, but often requires access to the original training data or additional generative networks. Knowledge transfer between networks can be improved by being agnostic to the choice of network architecture and by reducing the dependence on original training data. We propose a knowledge transfer approach from a teacher to a student network wherein we train the student on an independent transferal dataset whose annotations are generated by the teacher. Experiments were conducted on five state-of-the-art networks for semantic segmentation and seven datasets across three imaging modalities. We studied knowledge transfer from a single teacher, the combination of knowledge transfer and fine-tuning, and knowledge transfer from multiple teachers. The student model with a single teacher achieved performance similar to the teacher's, and the student model with multiple teachers achieved better performance than the teachers. The salient features of our algorithm include: 1) no need for original training data or generative networks, 2) knowledge transfer between different architectures, 3) ease of implementation for downstream tasks by using the downstream task dataset as the transferal dataset, 4) knowledge transfer of an ensemble of models, trained independently, into one student model. Extensive experiments demonstrate that the proposed algorithm is effective for knowledge transfer and easily tunable.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses knowledge transfer in medical image segmentation: how to transfer what one network has learned to another network of a different architecture when neither the original training data nor a generative network is available. The proposed method has the teacher model annotate an independent transferal dataset with pseudo-labels, and then trains the student model on those teacher-generated annotations. Because the student learns only from the teacher's outputs, the transfer does not depend on shared weights or matching layers, and it does not depend on the teacher's original training data.
The algorithm has the following characteristics:
1. It does not require the original training data or additional generative networks.
2. It can transfer knowledge between neural networks of different architectures.
3. It is easy to apply to downstream tasks, because the downstream task dataset can itself serve as the transferal dataset.
4. It can transfer the knowledge of multiple independently trained models into one student model.
In the reported experiments, a student trained with a single teacher achieved performance similar to that of the teacher, while a student trained with multiple teachers performed better than the teachers. These results indicate that the proposed algorithm is effective and easy to tune for knowledge transfer in medical image segmentation.
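
The transfer procedure summarized above can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: the two "teachers" are hypothetical hand-written functions standing in for trained segmentation networks, each "image" is a short list of pixel intensities, the student is a one-feature logistic model, and averaging the teachers' probabilities is only one assumed way to combine multiple teachers. The point is the data flow: the teachers annotate an independent transferal dataset, and the student is trained on those pseudo-masks alone.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Two hypothetical, independently "trained" teachers (stand-ins only).
# Each maps a pixel intensity in [0, 1] to a foreground probability.
def teacher_a(x):
    return sigmoid(12.0 * (x - 0.45))


def teacher_b(x):
    return sigmoid(12.0 * (x - 0.55))


def annotate(teachers, images):
    """Label transferal images with the mean teacher probability per pixel."""
    return [[sum(t(p) for t in teachers) / len(teachers) for p in img]
            for img in images]


def train_student(images, soft_masks, epochs=500, lr=2.0):
    """Fit p = sigmoid(w * (x - mu) + b) to the pseudo-masks by gradient descent."""
    pixels = [(x, y) for img, mask in zip(images, soft_masks)
              for x, y in zip(img, mask)]
    mu = sum(x for x, _ in pixels) / len(pixels)  # center the input
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in pixels:
            err = sigmoid(w * (x - mu) + b) - y  # cross-entropy gradient w.r.t. the logit
            grad_w += err * (x - mu)
            grad_b += err
        w -= lr * grad_w / len(pixels)
        b -= lr * grad_b / len(pixels)
    return lambda x: sigmoid(w * (x - mu) + b)


def agreement(student, teachers, images):
    """Fraction of pixels where the thresholded student matches the teachers."""
    masks = annotate(teachers, images)
    pairs = [(student(x) > 0.5, y > 0.5)
             for img, mask in zip(images, masks) for x, y in zip(img, mask)]
    return sum(s == t for s, t in pairs) / len(pairs)


def random_images(rng, count, size=16):
    """Synthetic 'images': lists of uniformly random pixel intensities."""
    return [[rng.random() for _ in range(size)] for _ in range(count)]


rng = random.Random(0)
transferal = random_images(rng, 30)   # independent of any teacher training data
held_out = random_images(rng, 30)
teachers = [teacher_a, teacher_b]
student = train_student(transferal, annotate(teachers, transferal))
print("student/teacher agreement:", agreement(student, teachers, held_out))
```

Passing a single-element list such as `[teacher_a]` gives the single-teacher setting. The student only ever sees the transferal images and the teacher outputs, which is why the teacher and student need not share an architecture.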