Evaluating the Diagnostic Potential of Connected Speech for Benign Laryngeal Disease Using Deep Learning Analysis

Jeong Hoon Lee, Jungirl Seok, Jae Yeong Kim, Hee Chan Kim, Tack-Kyun Kwon
DOI: https://doi.org/10.1016/j.jvoice.2024.01.015
IF: 2.3
2024-02-14
Journal of Voice
Abstract:
Objectives: This study aimed to evaluate the performance of artificial intelligence (AI) models using connected speech and vowel sounds in detecting benign laryngeal diseases.
Study design: Retrospective.
Methods: Voice samples from 772 patients, including 502 with normal voices and 270 with vocal cord polyps, cysts, or nodules, were analyzed. We employed deep learning architectures, including convolutional neural networks (CNNs) and time series models, to process the speech data. The primary endpoint was the area under the receiver operating characteristic curve (AUROC) for binary classification.
Results: CNN models analyzing speech segments significantly outperformed those using vowel sounds in distinguishing patients with and without benign laryngeal diseases. The best-performing CNN model achieved AUROCs of 0.895 and 0.845 for speech and vowel sounds, respectively. Correlations between AI-generated disease probabilities and perceptual assessments were more pronounced in the connected-speech analyses. The time series models, however, performed worse than the CNNs.
Conclusion: Connected speech analysis is more effective than traditional vowel sound analysis for the diagnosis of laryngeal voice disorders. This study highlights the potential of AI technologies to enhance the diagnostic capabilities of speech and supports further exploration and validation in this field.
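The primary endpoint named in the abstract, the area under the receiver operating characteristic curve for binary classification, can be read as the probability that a randomly chosen diseased sample receives a higher model score than a randomly chosen normal sample. The following is a minimal sketch of that definition, not the authors' evaluation code. The function name `auroc` and the toy labels and scores are assumptions made up for illustration; they are not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = disease, 0 = normal (binary classification).
    scores: model-predicted disease probabilities.
    Returns the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up example (NOT study data): three normal and three diseased samples.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
toy_auc = auroc(labels, scores)
print(round(toy_auc, 3))  # 7 of 9 pairs are ranked correctly -> 0.778
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported 0.895 (connected speech) and 0.845 (vowel sounds). The pairwise loop is quadratic in the number of samples; for larger datasets a rank-based routine from an established library would be the usual choice.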
otorhinolaryngology, audiology & speech-language pathology