Machine Learning Models for Breast Lesions Based on Ultrasound Imaging Features: An Observational Study
Yao Tan, Ling Huo, Shu Wang, Cuizhi Geng, Yi Li, XiangJun Ma, Bin Wang, YingJian He, Chen Yao, Tao Ouyang
DOI: https://doi.org/10.21203/rs.3.rs-101184/v1
2020-01-01
Abstract:
Background: The accuracy of breast cancer (BC) screening based on conventional ultrasound examination depends largely on the experience of clinicians. Further, the effectiveness of BC screening and diagnosis in primary hospitals needs to be improved. This study aimed to establish a simple, practical, and easy-to-promote machine learning model based on ultrasound imaging features for diagnosing BC, and to evaluate its usefulness.
Methods: Logistic regression, random forest, extra trees, support vector machine, multilayer perceptron, and XGBoost models were developed. The modeling data set was divided into a training set and a test set in a 75%:25% ratio; the training set was used to establish the models and the test set to assess their performance. A validation data set from primary hospitals was used for external validation of the models. The area under the receiver operating characteristic curve (AUC) was the main evaluation index, and pathological biopsy was the gold standard for evaluating each model. The diagnostic capability of the models was also compared with that of clinicians.
Results: Among the six models, the logistic regression model performed best, with AUCs of 0.771 and 0.906 in the test and validation sets, respectively, and Brier scores of 0.18 and 0.165. The AUC of the logistic regression model was 0.875 in tertiary class A hospitals and 0.921 in primary hospitals. The AUCs of clinician diagnosis and the logistic regression model were 0.913 and 0.906, respectively. The corresponding AUCs were 0.890 and 0.875 in tertiary class A hospitals, and 0.924 and 0.921 in primary hospitals.
Conclusions: The logistic regression model has better overall performance in primary hospitals and can be extended further to the primary care level. A more balanced clinical prediction model can be established, on the premise of improving accuracy, to assist clinicians in decision making and improve diagnosis.
Trial Registration: http://www.clinicaltrials.gov. ClinicalTrials.gov ID: NCT03080623.
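The evaluation scheme described in the Methods (a 75%:25% split, with AUC as the main index and the Brier score as a calibration measure) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_75_25`, `auc`, and `brier` and the toy labels and probabilities are hypothetical and chosen only for illustration, and the abstract does not state how the split was randomized.

```python
import random


def split_75_25(rows, seed=0):
    """Shuffle rows and split them into 75% training and 25% test parts.

    Assumption: a simple random split; the study's actual procedure is not given.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(0.75 * len(rows)))
    return rows[:cut], rows[cut:]


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier(labels, probs):
    """Brier score: mean squared difference between the predicted
    probability and the observed 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data: 1 = malignant on biopsy, 0 = benign.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.4, 0.3, 0.5, 0.1, 0.7, 0.2]

if __name__ == "__main__":
    train, test = split_75_25(range(len(labels)))
    print(len(train), len(test))      # 6 2
    print(auc(labels, probs))         # 0.9375
    print(round(brier(labels, probs), 5))
```

In the toy data, 15 of the 16 positive-negative pairs are ranked correctly, which gives the AUC of 0.9375. The Brier score complements the AUC because it penalizes poorly calibrated probabilities even when the ranking of cases is good.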