Simplifying risk stratification for thyroid nodules on ultrasound: validation and performance of an artificial intelligence thyroid imaging reporting and data system

Benjamin Wildman-Tobriner, Jichen Yang, Brian C Allen, Lisa M Ho, Chad M Miller, Maciej A Mazurowski
DOI: https://doi.org/10.1067/j.cpradiol.2024.07.006
Abstract: Purpose: To validate the performance of a recently created risk stratification system (RSS) for thyroid nodules on ultrasound, the Artificial Intelligence Thyroid Imaging Reporting and Data System (AI TI-RADS). Materials and methods: 378 thyroid nodules from 320 patients were included in this retrospective evaluation. All nodules had ultrasound images and had undergone fine needle aspiration (FNA). 147 nodules were Bethesda V or VI (suspicious or diagnostic for malignancy), and 231 were Bethesda II (benign). Three radiologists assigned features to each nodule from its ultrasound images according to the AI TI-RADS lexicon, which uses the same categories and features as the American College of Radiology (ACR) TI-RADS. FNA recommendations under AI TI-RADS and ACR TI-RADS were then compared, and sensitivity and specificity were calculated for each RSS. Results: Across the three readers, mean sensitivity of AI TI-RADS was lower than that of ACR TI-RADS (0.69 vs 0.72, p < 0.02), while mean specificity was higher (0.40 vs 0.37, p < 0.02). The total number of points assigned by all three readers decreased slightly with AI TI-RADS (5,998 for AI TI-RADS vs 6,015 for ACR TI-RADS), with more values of 0 assigned to several features. Conclusion: AI TI-RADS performed similarly to ACR TI-RADS while eliminating point assignments for many features, allowing for simplification of future TI-RADS versions.
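As a minimal sketch of how sensitivity and specificity could be computed for a risk stratification system, the example below compares per-nodule FNA recommendations with a malignancy label derived from cytology. The function name `sensitivity_specificity` and the toy data are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    recommended_fna: Sequence[bool], malignant: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of FNA recommendations.

    A recommendation on a malignant nodule is a true positive; no
    recommendation on a benign nodule is a true negative.
    """
    if len(recommended_fna) != len(malignant):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(recommended_fna, malignant))
    tp = sum(1 for r, m in pairs if r and m)
    fn = sum(1 for r, m in pairs if not r and m)
    tn = sum(1 for r, m in pairs if not r and not m)
    fp = sum(1 for r, m in pairs if r and not m)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign nodule")

    return tp / (tp + fn), tn / (tn + fp)


# Toy data for illustration only (NOT the study's nodules):
# 4 malignant nodules followed by 6 benign nodules.
malignant = [True, True, True, True, False, False, False, False, False, False]
recommended = [True, True, True, False, True, True, True, False, False, False]

sens, spec = sensitivity_specificity(recommended, malignant)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a multi-reader setting like the one described, the same calculation could be run once per reader and per RSS, and the results averaged to give mean values.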