Automated Machine-Learning Framework Integrating Histopathological and Radiological Information for Predicting IDH1 Mutation Status in Glioma

Dingqian Wang, Cuicui Liu, Xiuying Wang, Xuejun Liu, Chuanjin Lan, Peng Zhao, William C Cho, Manuel B Graeber, Yingchao Liu
DOI: https://doi.org/10.3389/fbinf.2021.718697
2021-10-26
Abstract: Diffuse gliomas are the most common malignant primary brain tumors. Identification of isocitrate dehydrogenase 1 (IDH1) mutations aids the diagnostic classification of these tumors and the prediction of their clinical outcomes. While histology continues to play a key role in frozen section diagnosis, as a diagnostic reference and as a method for monitoring disease progression, recent research has demonstrated that multi-parametric magnetic resonance imaging (MRI) sequences can predict IDH genotypes. In this paper, we aim to improve the prediction accuracy of IDH1 genotypes by integrating multi-modal imaging information from digitized histopathological data, derived from routine histological slide scans, and from MRI sequences including T1-contrast (T1) and fluid-attenuated inversion recovery (T2-FLAIR). We have established an automated framework to process, analyze and integrate the histopathological and radiological information from high-resolution pathology slides and multi-sequence MRI scans. Our machine-learning framework computes multi-level information, including molecular-level, cellular-level and texture-level information, that is predictive of IDH genotype. Firstly, an automated pre-processing step was developed to select the regions of interest (ROIs) from pathology slides. Secondly, to interactively fuse the complementary multimodal information, comprehensive feature information was extracted from the pathology ROIs and from the segmented tumor regions (enhanced tumor, edema and non-enhanced tumor) in the MRI sequences. Thirdly, a Random Forest (RF)-based algorithm was employed to identify and quantitatively characterize features of histopathological and radiological imaging origin, respectively. Finally, we integrated the multi-modal imaging features with a machine-learning algorithm and tested the performance of the framework for IDH1 genotyping. We also provided visual and statistical explanations to support the understanding of the prediction outcomes. Training and testing experiments on 217 pathologically verified, IDH1-genotyped glioma cases from multiple sources validated that our fully automated machine-learning model predicted IDH1 genotypes with greater accuracy and reliability than models based on radiological imaging data only. The accuracy of IDH1 genotype prediction was 0.90, compared with 0.82 for the radiomics-only model. Thus, the integration of multi-parametric imaging features for automated analysis of cross-modal biomedical data improved the prediction accuracy of glioma IDH1 genotypes.
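The integration step can be pictured with a minimal sketch. The abstract does not say how the two feature sets are combined, so the code below assumes simple feature-level fusion: the per-case pathology and radiology vectors are concatenated, and accuracy is the fraction of correctly predicted genotypes. The function names, case identifiers and numeric values are hypothetical placeholders, not the authors' implementation or data.

```python
from typing import Dict, List, Sequence


def fuse_features(
    pathology: Dict[str, Sequence[float]],
    radiology: Dict[str, Sequence[float]],
) -> Dict[str, List[float]]:
    """Concatenate feature vectors for cases that have both modalities."""
    # Assumption: cases missing either modality are excluded from the fused set.
    shared_cases = sorted(set(pathology) & set(radiology))
    return {
        case: list(pathology[case]) + list(radiology[case])
        for case in shared_cases
    }


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted genotype matches the reference."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Placeholder values for illustration only; they are not data from the study.
pathology_features = {"case_a": [0.2, 0.7], "case_b": [0.9, 0.1]}
radiology_features = {"case_a": [1.5], "case_b": [0.3], "case_c": [2.0]}

fused = fuse_features(pathology_features, radiology_features)
# case_c is dropped because it has no pathology features.

# Toy labels (1 = IDH1-mutant, 0 = wild-type); three of four predictions match.
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

In a setup like this, the fused vectors would be passed to a classifier such as a random forest, and the same accuracy measure would be computed for a radiology-only model to allow a comparison of the kind reported above.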