A proof-of-concept investigation into predicting follicular carcinoma on ultrasound using topological data analysis and radiomics

Andrew M. Thomas, Ann C. Lin, Grace Deng, Yuchen Xu, Gustavo Fernandez-Ranvier, Aida Taye, David S. Matteson, Denise Lee
DOI: https://doi.org/10.1101/2023.10.18.23297210
2024-07-10
Abstract:

Background: Sonographic risk patterns identified in established risk stratification systems (RSS) may not accurately stratify follicular carcinoma from adenoma, which share many similar ultrasound (US) characteristics. The purpose of this study is to investigate the performance of a multimodal machine learning model utilizing radiomics and topological data analysis (TDA) to predict malignancy in follicular thyroid neoplasms on ultrasound.

Methods: This is a retrospective study of patients who underwent thyroidectomy with pathology-confirmed follicular adenoma or carcinoma at a single academic medical center between 2010 and 2022. Features derived from radiomics and TDA were calculated from processed ultrasound images, and the high-dimensional features in each modality were projected onto their first two principal components. Logistic regression with an L2 penalty was used to predict malignancy, and performance was evaluated using leave-one-out cross-validation and area under the curve (AUC).

Results: Patients with follicular adenomas (n=7) and follicular carcinomas (n=11) with available imaging were included. The best multimodal model achieved an AUC of 0.88 (95% CI: [0.85, 1]), whereas the best radiomics model achieved an AUC of 0.68 (95% CI: [0.61, 0.84]).

Conclusions: We demonstrate that inclusion of topological features yields strong improvement over radiomics-based features alone in the prediction of follicular carcinoma on ultrasound. Despite low-volume data, the TDA features explicitly capture shape information that likely augments performance of the multimodal machine learning model. This approach suggests that a quantitative-based US RSS may contribute to the preoperative prediction of follicular carcinoma.
Radiology and Imaging
What problem does this paper attempt to address?
The paper addresses the following problem: existing risk stratification systems (RSS) cannot reliably distinguish thyroid follicular adenomas from follicular carcinomas, because the two lesions share many characteristics on ultrasound images. The authors therefore explore whether a multimodal machine-learning model using topological data analysis (TDA) and radiomics can improve the prediction of follicular carcinoma from ultrasound images. Specifically, the main objectives of the paper are:

1. **Improve prediction accuracy**: Combine TDA and radiomics features in a multimodal machine-learning model that distinguishes follicular adenomas from follicular carcinomas more accurately.
2. **Explore the application of new methods**: Assess the potential of TDA on thyroid ultrasound images, in particular its ability to capture shape information, to enhance prediction of follicular carcinoma.
3. **Reduce unnecessary surgeries**: Improve preoperative prediction so that fewer benign lesions are operated on unnecessarily, making better use of medical resources.

### Method overview

- **Data collection**: A retrospective study of patients who underwent thyroidectomy at a single academic medical center between 2010 and 2022 and had pathology-confirmed follicular adenoma or follicular carcinoma.
- **Feature extraction**:
  - **Radiomics**: Extract high-dimensional features from processed ultrasound images.
  - **TDA**: Use tools such as persistent homology to extract topological features from the images.
  - The high-dimensional features in each modality were projected onto their first two principal components.
- **Model development**: Logistic regression with an L2 penalty, evaluated with leave-one-out cross-validation.
- **Result evaluation**: The main evaluation metric is the area under the curve (AUC).

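
The modeling steps described in the abstract (per-modality projection onto two principal components, L2-penalized logistic regression, leave-one-out cross-validation, AUC) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code: the data are synthetic, the feature counts `n_radiomics` and `n_tda` are hypothetical, and fitting the scaling and PCA inside each fold is an assumption, since the summary does not say how the projection was handled during cross-validation.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 18 cases (7 adenomas = 0, 11 carcinomas = 1).
y = np.array([0] * 7 + [1] * 11)
n_radiomics, n_tda = 50, 30  # hypothetical feature counts per modality
X_radiomics = rng.normal(size=(len(y), n_radiomics))
# Inject an artificial class signal into the "TDA" block for illustration only.
X_tda = rng.normal(size=(len(y), n_tda)) + 0.8 * y[:, None]
X = np.hstack([X_radiomics, X_tda])


def two_pcs():
    """Standardize one modality, then keep its first two principal components."""
    return make_pipeline(StandardScaler(), PCA(n_components=2))


def build_model():
    """Per-modality PCA followed by logistic regression (L2 is the default penalty)."""
    per_modality = ColumnTransformer([
        ("radiomics", two_pcs(), slice(0, n_radiomics)),
        ("tda", two_pcs(), slice(n_radiomics, n_radiomics + n_tda)),
    ])
    return make_pipeline(per_modality, LogisticRegression(C=1.0))


def loocv_auc(X, y):
    """Score each held-out case with a model fit on the others, then pool for AUC."""
    scores = np.empty(len(y))
    for train_idx, test_idx in LeaveOneOut().split(X):
        model = build_model().fit(X[train_idx], y[train_idx])
        scores[test_idx] = model.predict_proba(X[test_idx])[:, 1]
    return roc_auc_score(y, scores), scores


auc, scores = loocv_auc(X, y)
print(f"LOOCV AUC on synthetic data: {auc:.2f}")
```

Because each leave-one-out fold holds out a single case, AUC cannot be computed per fold; the sketch pools the held-out probabilities and computes one AUC over all 18 cases. The printed value reflects only the injected synthetic signal and says nothing about the study's results.
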
### Main findings

- **Multimodal model**: The best multimodal model combining TDA and radiomics features achieved the highest AUC, 0.88 (95% CI: [0.85, 1]), compared with 0.68 (95% CI: [0.61, 0.84]) for the best radiomics-only model.
- **Importance of TDA**: Despite the small sample (7 adenomas and 11 carcinomas), the TDA features explicitly capture shape information, which likely accounts for the performance gain.

### Conclusion

This proof-of-concept study suggests that adding TDA features can substantially improve the prediction of follicular carcinoma on ultrasound, particularly where existing RSS struggle to separate follicular adenomas from follicular carcinomas. The approach points toward a quantitative ultrasound RSS that may support the preoperative evaluation of thyroid nodules.
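
To make the idea of "shape information" from persistent homology more concrete, here is a small, self-contained sketch that computes 0-dimensional persistence (births and deaths of connected components) for the sublevel-set filtration of a tiny grayscale grid. It is only an illustration under stated assumptions: the function name `sublevel_persistence_h0`, the 4-connectivity choice, and the toy image are hypothetical, and the summary does not specify which topological features or software the authors actually used.

```python
import math


def sublevel_persistence_h0(image):
    """Return sorted (birth, death) pairs of connected components (H0) for the
    sublevel-set filtration of a 2D grid, using 4-connectivity."""
    rows, cols = len(image), len(image[0])
    # Pixels enter the filtration in order of increasing intensity.
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels already added
    birth = {}   # birth value of the component rooted at each root pixel

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    pairs = []
    for value, r, c in order:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q not in parent:
                continue  # neighbor outside the grid or not yet added
            a, b = find(p), find(q)
            if a == b:
                continue
            # Elder rule: the component born later dies at the merge value.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if value > birth[young]:
                pairs.append((birth[young], value))  # skip zero-persistence pairs
            parent[young] = old
    # The oldest component never dies.
    pairs.append((birth[find((0, 0))], math.inf))
    return sorted(pairs)


# Toy image: four dark "basins" in the corners separated by a bright ridge.
image = [
    [1, 5, 2],
    [5, 5, 5],
    [3, 5, 4],
]
print(sublevel_persistence_h0(image))
```

Each pair records the intensity at which a dark region appears and the intensity at which it merges into an older one; long-lived pairs correspond to prominent structures, and summaries of such pairs are one common way to turn persistence output into numeric features for a classifier.
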