Predicting BRAF V600E mutations in papillary thyroid carcinoma using six machine learning algorithms based on ultrasound elastography
Enock Adjei Agyekum, Yu-guo Wang, Fei-Ju Xu, Debora Akortia, Yong-zhen Ren, Kevoyne Hakeem Chambers, Xian Wang, Jenny Olalia Taupa, Xiao-qin Qian
DOI: https://doi.org/10.1038/s41598-023-39747-6
IF: 4.6
2023-08-05
Scientific Reports
Abstract: The most common BRAF mutation is a thymine (T) to adenine (A) missense mutation at nucleotide 1796 (T1796A, V600E). The BRAF V600E gene encodes a protein-dependent kinase (PDK), which is a key component of the mitogen-activated protein kinase pathway and essential for controlling cell proliferation, differentiation, and death. The BRAF V600E mutation causes PDK to be activated improperly and continuously, resulting in abnormal proliferation and differentiation in papillary thyroid carcinoma (PTC). Based on ultrasound (US) elastography radiomic features, this study seeks to create and validate six distinct machine learning algorithms to predict the BRAF V600E mutation in PTC patients prior to surgery. This study employed routine US strain elastography image data from 138 PTC patients. The patients were separated into two groups: those who did not have the BRAF V600E mutation (n = 75) and those who did have the mutation (n = 63). The patients were randomly assigned to one of two data sets: training (70%) or validation (30%). A total of 479 radiomic features were extracted from the strain elastography US images. Pearson's Correlation Coefficient (PCC) and Recursive Feature Elimination (RFE) with stratified tenfold cross-validation were used to reduce the number of features. Based on the selected radiomic features, six machine learning algorithms were compared for predicting the likelihood of BRAF V600E: support vector machine with the linear kernel (SVM_L), support vector machine with radial basis function kernel (SVM_RBF), logistic regression (LR), Naïve Bayes (NB), K-nearest neighbors (KNN), and linear discriminant analysis (LDA). The performance of the machine learning algorithms was evaluated using accuracy (ACC), the area under the curve (AUC), sensitivity (SEN), specificity (SPEC), positive predictive value (PPV), negative predictive value (NPV), decision curve analysis (DCA), and calibration curves.
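As an illustration of the correlation-based step in the feature reduction described above, the following is a minimal sketch of a greedy Pearson filter that drops any feature too strongly correlated with one already kept. The abstract does not state the threshold or the exact filtering rule the authors used, so the `threshold=0.9` default, the greedy order, the function names, and the toy feature values are all assumptions made for illustration. The subsequent RFE step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| with every kept feature <= threshold.

    `features` maps a feature name to its list of per-patient values.
    The threshold value is an assumption, not taken from the paper.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values (hypothetical, not from the study): the second feature is
# almost a rescaled copy of the first, so the filter removes it.
toy_features = {
    "mean_intensity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "scaled_intensity": [2.1, 4.0, 6.2, 8.1, 9.9],
    "texture_entropy": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(toy_features)
print(selected)
```

A greedy pass like this depends on feature order; other rules, such as dropping the member of each correlated pair with the weaker association to the label, are equally plausible readings of the abstract.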
(1) The machine learning algorithms' diagnostic performance depended on 27 radiomic features. (2) AUCs for NB, KNN, LDA, LR, SVM_L, and SVM_RBF were 0.80 (95% confidence interval [CI]: 0.65–0.91), 0.87 (95% CI 0.73–0.95), 0.91 (95% CI 0.79–0.98), 0.92 (95% CI 0.80–0.98), 0.93 (95% CI 0.80–0.98), and 0.98 (95% CI 0.88–1.00), respectively. (3) There was a significant difference in echogenicity, vertical and horizontal diameter ratios, and elasticity between PTC patients with BRAF V600E and PTC patients without BRAF V600E. Machine learning algorithms based on US elastography radiomic features are capable of predicting the likelihood of BRAF V600E in PTC patients, which can assist physicians in identifying the risk of BRAF V600E in PTC patients. Among the six machine learning algorithms, the support vector machine with radial basis function kernel (SVM_RBF) achieved the best ACC (0.93), AUC (0.98), SEN (0.95), SPEC (0.90), PPV (0.91), and NPV (0.95).
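For readers less familiar with the reported metrics, the sketch below shows how ACC, SEN, SPEC, PPV, and NPV follow from the four cells of a binary confusion matrix, treating BRAF V600E mutation as the positive class. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's validation results.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: mutation-positive cases predicted positive/negative.
    tn/fp: mutation-negative cases predicted negative/positive.
    """
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # overall accuracy
        "SEN": tp / (tp + fn),                   # sensitivity (recall)
        "SPEC": tn / (tn + fp),                  # specificity
        "PPV": tp / (tp + fp),                   # positive predictive value
        "NPV": tn / (tn + fn),                   # negative predictive value
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike SEN and SPEC, PPV and NPV depend on the proportion of mutation-positive patients in the evaluated set, so they should be interpreted with the cohort's class balance in mind.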
multidisciplinary sciences