Development of machine learning model to predict pulmonary function with low‐dose CT‐derived parameter response mapping in a community‐based chest screening cohort
Xiuxiu Zhou, Yu Pu, Di Zhang, Yu Guan, Yang Lu, Weidong Zhang, Chi‐Cheng Fu, Qu Fang, Hanxiao Zhang, Shiyuan Liu, Li Fan
DOI: https://doi.org/10.1002/acm2.14171
2023-10-04
Journal of Applied Clinical Medical Physics
Abstract: Purpose To construct and evaluate a machine learning‐based model that predicts pulmonary function test (PFT) results from low‐dose computed tomography (LDCT)‐derived parametric response mapping (PRM).
Materials and methods A total of 615 subjects (40–74 years old) from a community‐based screening population were enrolled retrospectively. All had PFT parameters, including the ratio of forced expiratory volume in the first second to forced vital capacity (FEV1/FVC) and the percentage of predicted forced expiratory volume in one second (FEV1%), as well as registered inspiration‐to‐expiration chest CT scans. Subjects were classified into normal, high‐risk, and COPD groups based on PFT. A total of 72 PRM‐derived quantitative parameters were collected, including the volume and volume percentage of emphysema, functional small airways disease, and normal lung tissue. A random forest regression model and a multilayer perceptron (MLP) model were constructed and tested for PFT prediction, followed by evaluation of classification performance based on the predicted PFT values.
Results The random forest model based on PRM parameters predicted PFT better than the MLP, with a coefficient of determination (R²) of 0.749 for FEV1/FVC and 0.792 for FEV1%. For the random forest model, the mean squared errors (MSE) for FEV1/FVC and FEV1% were 0.0030 and 0.0097, respectively, and the root mean squared errors (RMSE) were 0.055 and 0.098, respectively. For differentiating the normal group from the high‐risk group, the sensitivity, specificity, and accuracy were 34/40 (85%), 65/72 (90%), and 99/112 (88%), respectively. For differentiating the non‐COPD group from the COPD group, the sensitivity, specificity, and accuracy were 8/9 (89%), 112/112 (100%), and 120/121 (99%), respectively.
Conclusions The machine learning‐based random forest model predicts PFT results from PRM in a community screening population. It distinguishes subjects at high risk of COPD from the normal population with high sensitivity and reliably predicts high‐risk COPD.
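The reported classification and error figures can be recomputed from the counts and MSE values given in the abstract. The following is a minimal sketch, not the authors' code: the helper name `rates` and the confusion-matrix counts (false negatives and false positives) are assumptions inferred from the reported fractions, for example 34/40 sensitivity implies 6 missed high-risk subjects.

```python
import math


def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    Hypothetical helper for illustration; not part of the published method.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts inferred from the abstract (assumption: fn and fp are the
# remainders of the reported fractions).
# Normal vs. high-risk: sensitivity 34/40, specificity 65/72.
normal_vs_high_risk = rates(tp=34, fn=6, tn=65, fp=7)
# Non-COPD vs. COPD: sensitivity 8/9, specificity 112/112.
non_copd_vs_copd = rates(tp=8, fn=1, tn=112, fp=0)

# RMSE is the square root of MSE; MSE values are those reported for the
# random forest model.
rmse_fev1_fvc = math.sqrt(0.0030)
rmse_fev1_pct = math.sqrt(0.0097)

for label, values in [
    ("normal vs. high-risk", normal_vs_high_risk),
    ("non-COPD vs. COPD", non_copd_vs_copd),
]:
    sens, spec, acc = values
    print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.2f}, accuracy={acc:.2f}")
print(f"RMSE FEV1/FVC={rmse_fev1_fvc:.3f}, RMSE FEV1%={rmse_fev1_pct:.3f}")
```

Under these assumptions the recomputed values agree with the rounded percentages and RMSE figures stated in the Results.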
radiology, nuclear medicine & medical imaging