A new method to screen high-risk COPD populations: machine learning-based cascade classification models based on low-dose CT scan

Yu Pu, Xiuxiu Zhou, Di Zhang, Yu Guan, Yi Xia, Yang Lu, Xuebin Zheng, Chuan He, Shiyuan Liu, Li Fan
DOI: https://doi.org/10.1007/s42058-023-00134-9
2024-01-24
Chinese Journal of Academic Radiology
Abstract:
Purpose: To explore the feasibility of machine learning-based cascade classification models for screening high-risk COPD populations.
Materials and methods: A total of 1637 community residents with available demographic data, smoking history, and pulmonary function tests (PFT) who underwent low-dose chest computed tomography (CT) from 2018 to 2020 were included. All subjects were divided into COPD and non-COPD groups using an FEV1/FVC threshold of 0.7. The non-COPD group was further subdivided into normal and high-risk COPD subgroups according to the FEV1% predicted value (FEV1% pre), using thresholds of 72%, 80%, and 95% in turn. Based on the subjects' basic information and CT quantitative parameters, random forest model 1 (RF_1) was established to distinguish the COPD group from the non-COPD group, and RF_2 was established to distinguish the high-risk COPD group from the normal group. RF_1 and RF_2 were then combined into a triple classification model using the cascade classification method. Subjects were randomly divided into training and test sets in a ratio of 8:2. Model performance was evaluated using AUC, accuracy, sensitivity, and specificity.
Results: With an FEV1/FVC threshold of 0.7, the accuracy of the triple classification model was 0.63 for an FEV1% pre threshold of 72%, 0.51 for a threshold of 80%, and 0.58 for a threshold of 95%.
Conclusions: Machine learning-based cascade classification models are a potential method for screening high-risk COPD populations from the general population. This method lays a foundation for a uniform approach to screening high-risk COPD populations.
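The grouping rule and the cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`reference_label`, `cascade_predict`, `accuracy`) are hypothetical, and the two stage classifiers are passed in as plain callables standing in for RF_1 and RF_2; any trained binary classifier, such as a random forest, could fill those roles. The abstract does not state which side of the FEV1% pre threshold counts as high risk, so the sketch assumes that values below the threshold do.

```python
from typing import Callable, Sequence

COPD, HIGH_RISK, NORMAL = "COPD", "high-risk", "normal"


def reference_label(fev1_fvc: float, fev1_pct_pred: float,
                    fev1_pct_threshold: float = 80.0) -> str:
    """Assign a three-class reference label from pulmonary function values."""
    # FEV1/FVC below 0.7 defines the COPD group.
    if fev1_fvc < 0.7:
        return COPD
    # Assumption: among non-COPD subjects, FEV1% predicted below the
    # chosen threshold (72, 80, or 95 in the abstract) marks high risk.
    if fev1_pct_pred < fev1_pct_threshold:
        return HIGH_RISK
    return NORMAL


def cascade_predict(is_copd: Callable[[Sequence[float]], bool],
                    is_high_risk: Callable[[Sequence[float]], bool],
                    features: Sequence[float]) -> str:
    """Combine two binary classifiers into one three-class prediction."""
    # Stage 1 (stand-in for RF_1): COPD versus non-COPD.
    if is_copd(features):
        return COPD
    # Stage 2 (stand-in for RF_2) only sees subjects that stage 1
    # passed on as non-COPD: high-risk versus normal.
    return HIGH_RISK if is_high_risk(features) else NORMAL


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of subjects whose predicted class matches the reference."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)
```

Because the second stage only receives subjects that the first stage classifies as non-COPD, a stage-one error cannot be corrected downstream, so the three-class accuracy depends on both stages together.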