Development of a Bayesian network model to differentiate between benign and malignant pulmonary nodules

Wucui Huang, Ziyang Huang, Bowen Zhang, Lile Wang, Xiaoli Zhu, Pei Liu
DOI: https://doi.org/10.21203/rs.3.rs-2290494/v1
2022-01-01
Abstract: Objective: To develop a diagnostic prediction model for pulmonary nodules based on a Bayesian network (BN). Methods: A total of 981 patients with pulmonary nodules were divided into benign and malignant groups on the basis of pathological findings. The patients’ clinical data, radiological features of the pulmonary nodules, serum tumor biomarkers and follow-up information were collected. Sixteen related variables were selected by univariate regression analysis and used to develop a BN model for pulmonary nodules. To evaluate its effectiveness, the BN model was compared with other machine learning models and with clinical prediction models. Results: Among the 981 patients, 334 had benign and 647 had malignant nodules. Sixteen variables differed significantly between the two groups (p < 0.05): age, history of tuberculosis, number of nodules, nodule location, maximum diameter, nodule type, lobulation sign, spiculation, pleural indentation, cavitation, vascular convergence, calcification, CEA, CYFRA21-1, interval between first and last follow-up, and change in nodule size during follow-up. The BN model performed well, with a sensitivity, specificity and area under the curve of 81.1%, 80.8% and 85.4%, respectively. Conclusion: We developed a BN model that integrates clinical data, computed tomography characteristics and serum biomarkers to predict the probability that a pulmonary nodule is benign or malignant. The BN model performs better than clinical prediction models in distinguishing between benign and malignant pulmonary nodules. The predictive performance of the BN model was similar to that of common machine learning models and clinical models. Our BN model overcomes the barriers to clinical application inherent to black-box machine learning models, and visualization of the BN model enhances ease of use for clinicians.
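To make the idea of a Bayesian network producing a malignancy probability concrete, here is a minimal sketch in Python. It is not the authors' model: the abstract does not give the network structure or any conditional probabilities, so the sketch assumes the simplest possible structure (each finding depends only on the class), uses only two of the sixteen variables, and every conditional probability in `CPT` is invented for illustration. The only number taken from the abstract is the cohort proportion of malignant nodules (647 of 981), used as the prior. The names `CPT`, `PRIOR_MALIGNANT` and `posterior_malignant` are hypothetical.

```python
# Toy illustration only: the structure and all conditional probabilities
# below are assumptions, not values from the paper.

# Proportion of malignant nodules in the cohort, as reported in the abstract.
PRIOR_MALIGNANT = 647 / 981

# Hypothetical P(finding present | class) for two of the sixteen variables.
CPT = {
    "spiculation": {"malignant": 0.60, "benign": 0.20},
    "calcification": {"malignant": 0.05, "benign": 0.25},
}


def posterior_malignant(findings, prior=PRIOR_MALIGNANT, cpt=CPT):
    """Return P(malignant | findings) by Bayes' rule.

    `findings` maps a feature name to True (present) or False (absent).
    Findings are assumed conditionally independent given the class, which
    is a simplification of a general Bayesian network.
    """
    joint_malignant = prior
    joint_benign = 1.0 - prior
    for name, present in findings.items():
        p_m = cpt[name]["malignant"]
        p_b = cpt[name]["benign"]
        joint_malignant *= p_m if present else 1.0 - p_m
        joint_benign *= p_b if present else 1.0 - p_b
    # Normalise the two joint probabilities into a posterior.
    return joint_malignant / (joint_malignant + joint_benign)


if __name__ == "__main__":
    p = posterior_malignant({"spiculation": True, "calcification": False})
    print(f"P(malignant | findings) = {p:.3f}")
```

With no findings supplied, the function returns the prior unchanged. Under these invented probabilities, a finding that is more common in malignant nodules raises the posterior and one that is more common in benign nodules lowers it. That kind of inspectable, step-by-step update is what the abstract contrasts with black-box models.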