AI Detection of Glottic Neoplasm Using Voice Signals, Demographics, and Structured Medical Records

Chi‐Te Wang, Tsai‐Min Chen, Nien‐Ting Lee, Shih‐Hau Fang
DOI: https://doi.org/10.1002/lary.31563
IF: 2.97
2024-06-13
The Laryngoscope
Abstract:
Objective: This study investigated whether artificial intelligence (AI) models combining voice signals, demographics, and structured medical records can detect glottic neoplasm from benign voice disorders.
Methods: We used a primary dataset containing 2–3 s of vowel "ah", demographics, and 26 items of structured medical records (e.g., symptoms, comorbidity, smoking and alcohol consumption, vocal demand) from 60 patients with pathology‐proved glottic neoplasm (i.e., squamous cell carcinoma, carcinoma in situ, and dysplasia) and 1940 patients with benign voice disorders. The validation dataset comprised data from 23 patients with glottic neoplasm and 1331 patients with benign disorders. The AI model combined convolutional neural networks, gated recurrent units, and attention layers. We used 10‐fold cross‐validation (training–validation–testing: 8–1–1) and preserved the percentage between neoplasm and benign disorders in each fold.
Results: Results from the AI model using voice signals reached an area under the ROC curve (AUC) value of 0.631, and additional demographics increased this to 0.807. The highest AUC of 0.878 was achieved when combining voice, demographics, and medical records (sensitivity: 0.783, specificity: 0.816, accuracy: 0.815). External validation yielded an AUC value of 0.785 (voice plus demographics; sensitivity: 0.739, specificity: 0.745, accuracy: 0.745). Subanalysis showed that AI had higher sensitivity but lower specificity than human assessment (p
medicine, research & experimental; otorhinolaryngology
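The sensitivity, specificity, and accuracy figures in the abstract can be cross-checked against the stated cohort sizes. The sketch below is illustrative only and is not the authors' code. It assumes that each reported rate corresponds to a single pooled confusion matrix over the whole cohort (for the primary dataset, test predictions pooled across the 10 folds), which the abstract does not state. The names `COHORTS`, `reconstruct_counts`, and `confusion_metrics` are hypothetical. Under that assumption, the nearest integer counts reproduce all three reported metrics to three decimals for both cohorts.

```python
# Consistency check of the metrics reported in the abstract.
# ASSUMPTION: each rate comes from one pooled confusion matrix per cohort;
# the abstract does not say how per-fold results were aggregated.

COHORTS = {
    # Cohort sizes and rates are taken from the abstract.
    "primary": {"neoplasm": 60, "benign": 1940,
                "sensitivity": 0.783, "specificity": 0.816, "accuracy": 0.815},
    "external": {"neoplasm": 23, "benign": 1331,
                 "sensitivity": 0.739, "specificity": 0.745, "accuracy": 0.745},
}


def reconstruct_counts(cohort):
    """Infer the integer confusion-matrix counts closest to the reported rates."""
    tp = round(cohort["sensitivity"] * cohort["neoplasm"])  # neoplasm flagged
    tn = round(cohort["specificity"] * cohort["benign"])    # benign cleared
    fn = cohort["neoplasm"] - tp                            # neoplasm missed
    fp = cohort["benign"] - tn                              # benign flagged
    return tp, fn, tn, fp


def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


for name, cohort in COHORTS.items():
    counts = reconstruct_counts(cohort)
    sens, spec, acc = confusion_metrics(*counts)
    print(name, counts, round(sens, 3), round(spec, 3), round(acc, 3))
```

Because neoplasm cases make up only about 3% of the primary cohort (60 of 2000), accuracy is dominated by specificity, which is why the two values are nearly identical in both cohorts.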
What problem does this paper attempt to address?