Thy-Wise: an Interpretable Machine Learning Model for the Evaluation of Thyroid Nodules

Zhe Jin, Shufang Pei, Lizhu Ouyang, Lu Zhang, Xiaokai Mo, Qiuying Chen, Jingjing You, Luyan Chen, Bin Zhang, Shuixing Zhang
DOI: https://doi.org/10.1002/ijc.34248
2022-01-01
Abstract: Current risk stratification systems for thyroid nodules suffer from low specificity and high biopsy rates. Machine learning (ML) has recently been introduced to assist thyroid nodule diagnosis, but it lacks interpretability. Here, we developed and validated ML models on 3965 thyroid nodules and compared them with the American College of Radiology Thyroid Imaging, Reporting and Data System (ACR TI-RADS). Subsequently, a SHapley Additive exPlanation (SHAP) algorithm was used to interpret the results of the best-performing ML model. Clinical characteristics, including thyroid-function tests, were collected from medical records. Five ACR TI-RADS ultrasonography (US) categories plus nodule size were assessed by experienced radiologists. Random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost) were used to build US-only and US-clinical ML models. The ML models and ACR TI-RADS were compared in terms of diagnostic performance and unnecessary biopsy rate. Among the ML models, the US-only RF model (hereafter, Thy-Wise) achieved the best performance. Compared with ACR TI-RADS, Thy-Wise showed higher accuracy (82.4% vs 74.8% for internal validation; 82.1% vs 73.4% for external validation) and specificity (78.7% vs 68.3% for internal validation; 78.5% vs 66.9% for external validation) while maintaining sensitivity (91.7% vs 91.2% for internal validation; 91.9% vs 91.1% for external validation), and it reduced unnecessary biopsies (15.3% vs 32.3% for internal validation; 15.7% vs 47.3% for external validation). The SHAP-based interpretation of Thy-Wise enables clinicians to better understand the reasoning behind the diagnosis, which may facilitate the clinical translation of this model.
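The accuracy, sensitivity and specificity figures quoted above follow the standard confusion-matrix definitions. As a minimal sketch of how such figures are computed, and not the authors' code, the Python snippet below derives the three metrics from counts of true/false positives and negatives. The class name, function name and the counts themselves are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary malignant-vs-benign classification."""
    tp: int  # malignant nodules classified as malignant
    fp: int  # benign nodules classified as malignant
    tn: int  # benign nodules classified as benign
    fn: int  # malignant nodules classified as benign


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, sensitivity and specificity as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true-positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true-negative rate
    }


# Hypothetical counts for illustration only (not data from the study).
counts = ConfusionCounts(tp=90, fp=20, tn=80, fn=10)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Higher specificity at unchanged sensitivity means fewer benign nodules are flagged as malignant, which is the pattern the abstract reports for Thy-Wise relative to ACR TI-RADS.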