Enhancement and evaluation for deep learning-based classification of volumetric neuroimaging with 3D-to-2D knowledge distillation

Hyemin Yoon,Do-Young Kang,Sangjin Kim
DOI: https://doi.org/10.1038/s41598-024-80938-6
IF: 4.6
2024-11-30
Scientific Reports
Abstract:The application of deep learning techniques for the analysis of neuroimaging has been increasing recently. The 3D Convolutional Neural Network (CNN) technology, which is commonly adopted to encode volumetric information, requires a large number of datasets. However, due to the nature of the medical domain, there are limitations in the number of data available. This is because the cost of acquiring imaging is expensive and the use of personnel to annotate diagnostic labels is resource-intensive. For these reasons, several prior studies have opted to use comparatively lighter 2D CNNs instead of the complex 3D CNN technology. They analyze using projected 2D datasets created from representative slices extracted from 3D volumetric imaging. However, this approach, by selecting only projected 2D slices from the entire volume, reflects only partial volumetric information. This poses a risk of developing lesion diagnosis systems without a deep understanding of the interrelations among volumetric data. We propose a novel 3D-to-2D knowledge distillation framework that utilizes not only the projected 2D dataset but also the original 3D volumetric imaging dataset. This framework is designed to employ volumetric prior knowledge in training 2D CNNs. Our proposed method includes three modules: (i) a 3D teacher network that encodes volumetric prior knowledge from the 3D dataset, (ii) a 2D student network that encodes partial volumetric information from the 2D dataset, and aims to develop an understanding of the original volumetric imaging, and (iii) a distillation loss introduced to reduce the gap in the graph representation expressing the relationship between data in the feature embedding spaces of (i) and (ii), thereby enhancing the final performance. The effectiveness of our proposed method is demonstrated by improving the classification performance orthogonally across various 2D projection methods on two datasets from the 123I-DaTscan SPECT and 18 F-AV133 PET from Parkinson's Progression Markers Initiative (PPMI). Notably, when our approach is applied to the FuseMe approach, it achieves an F1 score of 98.30%, which is higher than that of the 3D teacher network (97.66%).
multidisciplinary sciences
What problem does this paper attempt to address?